UNIFIED APPROACH FOR MINIMIZING COMPOSITE NORMS 

N. S. AYBAT* AND G. IYENGAR†

Abstract. We propose a first-order augmented Lagrangian algorithm (FALC) to solve the composite norm minimization problem

\[
\min_{X \in \mathbb{R}^{m \times n}} \ \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta \quad \text{subject to} \quad \|\mathcal{A}(X) - b\|_\gamma \leq \rho,
\]

where $\sigma(X)$ denotes the vector of singular values of the matrix $X \in \mathbb{R}^{m \times n}$, the matrix norm $\|\sigma(X)\|_\alpha$ denotes either the Frobenius, the nuclear, or the $\ell_2$-operator norm of the matrix $X$, the vector norms $\|\cdot\|_\beta$, $\|\cdot\|_\gamma$ denote either the $\ell_1$-norm, $\ell_2$-norm or the $\ell_\infty$-norm; and $\mathcal{A}(\cdot)$, $\mathcal{C}(\cdot)$ are linear operators from $\mathbb{R}^{m \times n}$ to vector spaces of appropriate dimensions. This formulation includes as special cases problems such as basis pursuit, matrix completion, robust PCA, and stable PCA. Thus, FALC is able to solve all these problems in a unified manner.

FALC solves this semidefinite optimization problem by inexactly solving a sequence of problems of the form

\[
\min \left\{ \lambda^{(k)} \left( \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|s\|_\beta \right) + \tfrac{1}{2} \|\mathcal{A}(X) + y - b - \lambda^{(k)} \theta_1^{(k)}\|_2^2 + \tfrac{1}{2} \|\mathcal{C}(X) + s - d - \lambda^{(k)} \theta_2^{(k)}\|_2^2 \ : \ X \in \mathbb{R}^{m \times n},\ s \in \mathbb{R}^p,\ \|y\|_\gamma \leq \rho \right\}
\]

for an appropriately chosen sequence of multipliers $\{\lambda^{(k)}, \theta_1^{(k)}, \theta_2^{(k)}\}_{k \in \mathbb{Z}_+}$. Each of these non-smooth subproblems is solved inexactly using Algorithm 3 in [33], where each update involves computing at most one singular value decomposition (SVD). We show that FALC converges to the optimal solution $X_*$ of the composite norm minimization problem whenever the optimal solution is unique. We also show that there exists an a priori fixed sequence $\{\lambda^{(k)}\}_{k \in \mathbb{Z}_+}$ such that for all $\epsilon > 0$, the iterates $X^{(k)}$ computed by FALC are $\epsilon$-feasible and $\epsilon$-optimal after $O(\log(1/\epsilon))$ iterations, which requires $O(1/\epsilon)$ operations in total, where the complexity of each operation is dominated by computing a singular value decomposition. We also show that FALC can be extended to solve problems where, in addition to the constraints above, we have constraints of the form $\mathcal{F}(X) \preceq G$, where $\mathcal{F}(\cdot)$ is a linear operator and $\preceq$ denotes the partial order with respect to the cone of positive semidefinite matrices. All the convergence properties of FALC continue to hold for this more general problem.

1. Introduction. We propose a first-order augmented Lagrangian algorithm (FALC) to solve the class of composite norm minimization problems defined as follows:

\[
\min_{X \in \mathbb{R}^{m \times n}} \ \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta \quad \text{subject to} \quad \|\mathcal{A}(X) - b\|_\gamma \leq \rho, \tag{1.1}
\]

where $\sigma(X) \in \mathbb{R}^{\min\{m,n\}}$ denotes the vector of singular values of the matrix $X \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^q$, $d \in \mathbb{R}^p$, $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^q$, $q \leq mn$, and $\mathcal{C} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$, $p \leq mn$, are linear operators and $\alpha, \beta, \gamma \in \{1, 2, \infty\}$. For $a \in \{1, 2, \infty\}$ the vector norm $\|\cdot\|_a$ denotes the $\ell_1$-norm, $\ell_2$-norm or the $\ell_\infty$-norm, respectively. Since the nuclear norm $\|X\|_* = \|\sigma(X)\|_1$, the Frobenius norm $\|X\|_F = \|\sigma(X)\|_2$, and the $\ell_2$-operator norm $\|X\|_2 = \|\sigma(X)\|_\infty$, the term $\|\sigma(X)\|_\alpha$ denotes either the nuclear, the Frobenius, or the $\ell_2$-operator norm.

We assume that $\mathcal{A}$ has full rank; we do not need this constraint on the operator $\mathcal{C}$. Although we focus on establishing the properties of FALC for problems of the form (1.1), we show in Section 5 that our proposed framework extends to a much larger class of problems. We show below that many well studied optimization problems are special cases of (1.1).
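The identification of the three matrix norms with vector norms of $\sigma(X)$ can be checked numerically. The following is a minimal sketch, not part of FALC itself: it assumes a $2 \times 2$ matrix, for which the singular values have a closed form in terms of the squared Frobenius norm and the determinant, and the helper names `singular_values_2x2` and `composite_matrix_norm` are hypothetical.

```python
import math

def singular_values_2x2(x):
    """Singular values (largest first) of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = x
    frob_sq = a * a + b * b + c * c + d * d      # equals sigma_1^2 + sigma_2^2
    det_abs = abs(a * d - b * c)                 # equals sigma_1 * sigma_2
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det_abs ** 2, 0.0))
    return [math.sqrt((frob_sq + disc) / 2.0),
            math.sqrt(max(frob_sq - disc, 0.0) / 2.0)]

def composite_matrix_norm(x, alpha):
    """||sigma(X)||_alpha for alpha in {1, 2, math.inf}."""
    sigma = singular_values_2x2(x)
    if alpha == 1:
        return sum(sigma)                            # nuclear norm
    if alpha == 2:
        return math.sqrt(sum(v * v for v in sigma))  # Frobenius norm
    if alpha == math.inf:
        return max(sigma)                            # l2-operator norm
    raise ValueError("alpha must be 1, 2 or math.inf")

X = [[3.0, 0.0], [4.0, 5.0]]
nuclear = composite_matrix_norm(X, 1)
frobenius = composite_matrix_norm(X, 2)
operator = composite_matrix_norm(X, math.inf)
```

For larger matrices the same three norms would be obtained from the singular values returned by any SVD routine.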

Nuclear norm minimization. The special case with $\alpha = 1$, i.e. $\|\sigma(X)\|_1 = \|X\|_*$, $\mu_2 = 0$, and $\rho = 0$, is known as the nuclear norm minimization problem

\[
\min \ \|X\|_* \quad \text{subject to} \quad \mathcal{A}(X) = b. \tag{1.2}
\]

The nuclear norm minimization problem is a convex approximation for the NP-hard rank minimization problem $\min_{X \in \mathbb{R}^{m \times n}} \{\mathrm{rank}(X) : \mathcal{A}(X) = b\}$, where $\mathrm{rank}(X)$ denotes the rank of $X \in \mathbb{R}^{m \times n}$. Rank minimization arises in many different contexts, e.g. system identification [28], optimal control [15, 16, 14], low-dimensional embedding in Euclidean space [27], and matrix completion. Matrix completion is the special case where the operator $\mathcal{A}$ picks a subset of the matrix elements, i.e. the linear constraints are of the form $X_{ij} = M_{ij}$ for $(i, j) \in \Omega$. The Netflix prize problem [30] is an example of the matrix completion problem. Recently, Recht et al. [21] have shown that when the linear operator $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^q$ satisfies some regularity properties, and the number of measurements $q = O(r(m + n)\ln(mn))$, the optimal solution of the SDP (1.2) is the optimal solution of the rank minimization problem with very high probability. Thus, FALC can be used to approximately solve rank minimization problems. For existing algorithmic methodologies for solving the
nuclear norm minimization problem see [H [THl [IS UHl HH 131] and references therein. 

Basis-pursuit problem. The special case of (1.1) with $\beta = 1$, $\mu_1 = 0$, $\rho = 0$, $\mathcal{C}(X) = X \in \mathbb{R}^{n \times 1}$ and $d = 0$, is known as the basis pursuit problem

\[
\min \ \|x\|_1 \quad \text{subject to} \quad Ax = b, \tag{1.3}
\]

where $A \in \mathbb{R}^{q \times n}$ and $b \in \mathbb{R}^q$. LPs of the form (1.3) have recently attracted a lot of attention since they
appear in the context of a new signal processing paradigm known as compressive sensing (CS) [SJ [S] [T] 113) . 
The goal in CS is to recover a sparse signal $x_0$ from a small set of linear measurements or transform values $b = Ax_0$, or equivalently, to solve the NP-hard $\ell_0$-minimization problem

\[
\min \ \|x\|_0 \quad \text{subject to} \quad Ax = b, \tag{1.4}
\]

where the $\ell_0$-norm $\|x\|_0 = \sum_{i=1}^{n} \mathbf{1}(x_i \neq 0)$. Recently, Candes, Romberg and Tao [S] [SJ [7] and Donoho [13] have shown that, when the target signal $x_0$ is $s$-sparse, i.e. only $s$ of the $n$ components are non-zero, and the matrix $A \in \mathbb{R}^{q \times n}$ has $q = O(s \ln(n))$ rows and is chosen randomly according to a specified set of distributions, the sparse target signal $x_0$ is the optimal solution of the LP (1.3) with very high probability. Thus, $x_0$ can be recovered by solving an LP, and therefore, in theory, signal recovery is very efficient. In practice, however, solving such LPs is hard because the matrix $A$ in (1.3) is large and dense, and in addition these LPs are often ill-conditioned. Thus, general purpose simplex-based LP solvers are not able to solve (1.3) efficiently. The measurement matrix $A$ in CS applications has a lot of structure; in particular, the matrix-vector multiplications $Ax$ and $A^T y$ can be computed efficiently. Recently, a number of different algorithms
have been proposed to exploit this structural fact to efficiently solve dO]) [T|[^P^P71P^[^[^[M 1 [55 1 [55 ] . 
Principal component pursuit. The special case of (1.1) with $\alpha = 1$, i.e. $\|\sigma(X)\|_\alpha = \|X\|_*$ denoting the nuclear norm, $\beta = 1$, $\mu_1 = 1$, $\mu_2 > 0$, $\mathcal{A} = 0$, $b = 0$, $\rho = 0$ and the operator $\mathcal{C} : \mathbb{R}^{m \times n} \to \mathbb{R}^{mn}$ such that $\mathcal{C}(X) = \mathrm{vec}(X)$, where $\mathrm{vec}(X)$ is the vector obtained by stacking the columns of $X \in \mathbb{R}^{m \times n}$ in order, is the principal component pursuit problem

\[
\min \ \|X\|_* + \mu_2 \|\mathrm{vec}(X) - d\|_1. \tag{1.5}
\]

In [5] [5S], it is shown that when the data matrix $D \in \mathbb{R}^{m \times n}$ is of the form $D = X_0 + S_0$, where $X_0$ is a low rank matrix and $S_0$ is a sparse matrix, then one can recover the low rank and sparse components of $D$ by solving the problem given in (1.5) for an appropriately chosen $\mu_2$. In [37], it was shown that the recovery is still possible even when the data matrix is corrupted with a dense error matrix. When the data matrix $D$ is of the form $D = X_0 + S_0 + \zeta_0$, where $X_0$ is a low rank matrix, $S_0$ is a sparse matrix and $\{(\zeta_0)_{ij}\}$ is independent and identically distributed for all $i, j$ such that $\|\zeta_0\|_F \leq \rho$, solving the stable principal component pursuit problem

\[
\min_{X, S \in \mathbb{R}^{m \times n}} \ \|X\|_* + \mu_2 \|\mathrm{vec}(S)\|_1 \quad \text{subject to} \quad \|X + S - D\|_F \leq \rho \tag{1.6}
\]

produces $(X^*, S^*)$ such that $\|X^* - X_0\|_F^2 + \|S^* - S_0\|_F^2 \leq C m n \rho^2$ for some constant $C$ with high probability. Principal component pursuit and stable principal component pursuit both have applications in video surveillance and face recognition. For existing algorithmic approaches to solving principal component pursuit see
[8] [H [251 [261 [37] and references therein. 

Matrix completion problems with semidefinite constraints. In Section 5 we show that FALC can be extended to solve a larger class of optimization problems of the form

\[
\begin{array}{ll}
\min_{X \in \mathbb{R}^{m \times n}} & \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta + \langle R, X \rangle, \\
\text{subject to} & \|\mathcal{A}(X) - b\|_\gamma \leq \rho, \\
 & \mathcal{G}(X) = h, \\
 & \mathcal{F}(X) \preceq G,
\end{array} \tag{1.7}
\]

where $X, R \in \mathbb{R}^{m \times n}$, $\mathcal{G}$ is a linear operator that maps $X$ to a vector, $\mathcal{F}$ is a linear operator that maps $X$ to a symmetric matrix, and $\preceq$ denotes the partial order induced by the cone of positive semidefinite matrices. In the sparse PCA problem

\[
\min_{x \in \mathbb{R}^n} \ \|\Sigma - x x^T\|_F \quad \text{subject to} \quad \|x\|_0 \leq s, \tag{1.8}
\]

the goal is to compute an $s$-sparse vector $x$ that is "close" to the eigenvector corresponding to the largest eigenvalue of the positive semidefinite matrix $\Sigma$. The optimization problem (1.8) is not convex, and is, therefore, hard to solve. Let $X = x x^T$. Then (1.8) is equivalent to $\min_{X \in \mathbb{R}^{n \times n}} \{\|X - \Sigma\|_F : \|\mathrm{vec}(X)\|_0 \leq s^2, \ \mathrm{rank}(X) = 1, \ X \succeq 0\}$. Since $\|X\|_*$ is the tightest convex lower bound for $\mathrm{rank}(X)$ on the unit ball of the operator norm, and $\|X\|_* = \mathrm{Tr}(X)$ for positive semidefinite matrices, the relaxed problem

\[
\begin{array}{ll}
\min_{X \in \mathbb{R}^{n \times n}} & \|X - \Sigma\|_F + \mu \|\mathrm{vec}(X)\|_1 + \nu \langle I, X \rangle \\
\text{subject to} & X \succeq 0
\end{array} \tag{1.9}
\]

is a convex approximation for (1.8), where $\mu$ and $\nu$ control the sparsity of the entries and of the singular values of $X$, respectively. See [9, 10, 21] for existing approaches for solving the sparse PCA problem.

Problems of the form (1.7) also appear in signal shaping applications. One such problem is the design of the optimal acquisition basis for radar applications. For simplicity assume that we are in a 1-D setting, and we discretize the space. Let $x \in \mathbb{R}^L$ denote the unknown locations of objects (i.e. reflectors). Let $d(t)$ denote the signal transmitted by the radar. Then the received signal $y \in \mathbb{R}^N$ is of the form

\[
y(t) = \sum_k d(t - \tau_k) x_k + \eta_t, \qquad t = 1, \ldots, N,
\]

where $\tau_k$ denotes the round-trip delay corresponding to the $k$-th discrete location, and $\eta_t \sim \mathcal{N}(0, \sigma^2)$ denotes the receiver noise. Thus, $y = Dx + \eta \in \mathbb{R}^N$, where the columns of $D \in \mathbb{R}^{N \times L}$ are the impulse responses of the channel for different delays, and $\eta \sim \mathcal{N}(0, \sigma^2 I)$. Typically, $x$ is very sparse, i.e. $\|x\|_0 \ll L$; therefore, one could recover $x$ by solving the CS-like problem $\min\{\|x\|_1 : \|y - Dx\|_2 \leq \delta\}$. The power cost of this approach is likely to be high since the matched filter and the analog-to-digital converter (ADC) have to run at a very high rate to generate $y(t)$ for all $t$. Since $x$ is sufficiently sparse, it is possible that $x$ can be recovered from a lower dimensional projection $Wy$ of the observations $y$ by solving the LP $\min\{\|x\|_1 : WDx = Wy\}$ for some $W \in \mathbb{R}^{M \times N}$, where $M < N$. Such a projection saves device power because the matrix multiplication $Wy$ is implemented as a matched filter in the analog domain and the ADC only needs to convert the product. For good performance of this strategy, one requires the following properties:

(i) small row dimension of $W$: this ensures a low dimensional transformed signal $Wy$.

(ii) small mutual incoherence $\max\{|(D^T W^T W D)_{ij}| : i \neq j\}$, with $\mathrm{diag}(D^T W^T W D) = \mathbf{1}$, of the measurement matrix $WD$: this ensures that $x$ can be reliably recovered by the LP.

(iii) small noise power $\sigma^2 \mathrm{Tr}(W^T W)$ of the compressed signal $Wy$.

Let $K = W^T W \succeq 0$. Since the nuclear norm $\|K\|_* = \mathrm{Tr}(K)$ is a good approximation for $\mathrm{rank}(K) = \mathrm{rank}(W)$, a good projection matrix $W$ can be computed by solving the SDP

\[
\begin{array}{ll}
\min_{K \in \mathbb{R}^{N \times N}} & \mu_1 \langle I, K \rangle + \mu_2 \|\mathrm{vec}_{\mathrm{off}}(D^T K D)\|_\infty, \\
\text{subject to} & \mathrm{diag}(D^T K D) = I, \\
 & K \succeq 0,
\end{array} \tag{1.10}
\]

where $\mathrm{diag}(X)$ denotes a diagonal matrix with entries given by the diagonal elements of $X$, $\mathrm{vec}_{\mathrm{off}}(X) = \mathrm{vec}(X - \mathrm{diag}(X))$, and $I$ is an identity matrix of size $N$.

1.1. FALC approach and summary of results. The composite norm minimization problem (1.1)
can be reformulated as a semidefinite program (SDP), and can, therefore, in theory, be solved efficiently [3 . 
However, for practical instances the resulting SDPs are large and typically dense. Therefore, interior point 
based SDP solvers perform very poorly on these instances. Recently a number of different first-order or 
restricted memory methods have been proposed for solving special cases of (1.1). In the previous section we provided references to the existing literature on algorithmic approaches for solving many of these special cases.



We propose a first-order augmented Lagrangian algorithm (FALC) to solve (1.1) and show that FALC can be extended easily to solve Problem (1.7) with the same complexity guarantees. In Section 2 we establish the convergence properties of FALC for the optimization problem (1.1) and later in Section 5 we briefly describe the extension to the more general problem (1.7).

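To obtain separable and efficiently solvable subproblems, we introduce slack variables $s$ and $y$, and reformulate (1.1) as

\[
\begin{array}{ll}
\min_{X \in \mathbb{R}^{m \times n},\, s \in \mathbb{R}^p,\, y \in \mathbb{R}^q} & \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|s\|_\beta, \\
\text{subject to} & \mathcal{C}(X) + s = d, \\
 & \mathcal{A}(X) + y = b, \\
 & \|y\|_\gamma \leq \rho.
\end{array} \tag{1.11}
\]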



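We solve (1.11) by inexactly solving a sequence of optimization problems of the form

\[
\begin{array}{rl}
\min_{X \in \mathbb{R}^{m \times n},\, s \in \mathbb{R}^p,\, y : \|y\|_\gamma \leq \rho} & \lambda^{(k)} \left( \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|s\|_\beta \right) \\
 & - \lambda^{(k)} (\theta_1^{(k)})^T (\mathcal{A}(X) + y - b) + \tfrac{1}{2} \|\mathcal{A}(X) + y - b\|_2^2 \\
 & - \lambda^{(k)} (\theta_2^{(k)})^T (\mathcal{C}(X) + s - d) + \tfrac{1}{2} \|\mathcal{C}(X) + s - d\|_2^2
\end{array} \tag{1.12}
\]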



for an appropriately chosen sequence of parameters $\{(\lambda^{(k)}, \theta_1^{(k)}, \theta_2^{(k)})\}_{k \in \mathbb{Z}_+}$. We solve these subproblems using Algorithm 3 in [33], where in each update step we need to solve one problem of the form

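\[
\min_{X \in \mathbb{R}^{m \times n},\, s \in \mathbb{R}^p,\, y : \|y\|_\gamma \leq \rho} \ \lambda^{(k)} \left( \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|s\|_\beta \right) + \left\langle \begin{bmatrix} \nabla_X f^{(k)}(\bar{X}, \bar{s}, \bar{y}) \\ \nabla_s f^{(k)}(\bar{X}, \bar{s}, \bar{y}) \\ \nabla_y f^{(k)}(\bar{X}, \bar{s}, \bar{y}) \end{bmatrix}, \begin{bmatrix} X - \bar{X} \\ s - \bar{s} \\ y - \bar{y} \end{bmatrix} \right\rangle + \frac{1}{2t} \left( \|X - \tilde{X}\|_F^2 + \|s - \tilde{s}\|_2^2 + \|y - \tilde{y}\|_2^2 \right) \tag{1.13}
\]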



where

\[
f^{(k)}(X, s, y) = - \lambda^{(k)} (\theta_1^{(k)})^T (\mathcal{A}(X) + y - b) + \tfrac{1}{2} \|\mathcal{A}(X) + y - b\|_2^2 - \lambda^{(k)} (\theta_2^{(k)})^T (\mathcal{C}(X) + s - d) + \tfrac{1}{2} \|\mathcal{C}(X) + s - d\|_2^2
\]

denotes the "smooth" part of the objective function in (1.12). Note that (1.13) is separable in $X$, $s$ and $y$, and it reduces to one vector "shrinkage" pi] (or constrained "shrinkage", see (2.30)) in $s$ and in $y$, and one "matrix shrinkage" [29] (or constrained "matrix shrinkage", see (2.29)) in $X$.
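To make the vector "shrinkage" step concrete, here is a minimal sketch for the unconstrained $\ell_1$ case only, i.e. the minimizer of $\delta \|s\|_1 + \frac{1}{2}\|s - q\|_2^2$, which is the standard componentwise soft-thresholding map. The function name `soft_threshold` and the sample data are illustrative assumptions; the constrained variants and the other norms treated in the paper are not covered.

```python
import math

def soft_threshold(q, delta):
    """Minimizer of delta*||s||_1 + 0.5*||s - q||_2^2, computed componentwise."""
    out = []
    for v in q:
        mag = abs(v) - delta          # shrink the magnitude by delta
        out.append(math.copysign(mag, v) if mag > 0 else 0.0)
    return out

shrunk = soft_threshold([3.0, -0.5, 1.0, -2.5], 1.0)
```

For the nuclear norm, the analogous unconstrained "matrix shrinkage" applies the same thresholding to the singular values of the input matrix, which is why each update needs at most one SVD.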

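In this paper we establish the following properties for the FALC algorithm:

(a) Every limit point $\bar{X}$ of the FALC iterates $\{X^{(k)}\}$ is an optimal solution of (1.1), i.e.

\[
\bar{X} \in \mathop{\mathrm{argmin}}_{X \in \mathbb{R}^{m \times n}} \left\{ \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta : \|\mathcal{A}(X) - b\|_\gamma \leq \rho \right\}.
\]

(b) Suppose (1.1) has a unique optimal solution $X_*$. There exists an a priori fixed sequence $\{\lambda^{(k)} : k \geq 1\}$ such that for all $\epsilon > 0$, the iterates $X^{(k)}$ computed by FALC are $\epsilon$-feasible and $\epsilon$-optimal, i.e.

\[
\|y^{(k)}\|_\gamma \leq \rho, \qquad \|\mathcal{A}(X^{(k)}) + y^{(k)} - b\|_2 \leq \epsilon,
\]
\[
\left| \left( \mu_1 \|\sigma(X^{(k)})\|_\alpha + \mu_2 \|\mathcal{C}(X^{(k)}) - d\|_\beta \right) - \left( \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta \right) \right| \leq \epsilon,
\]

after $O(\epsilon^{-1})$ iterations, where the complexity of each iteration is $O(\min\{n m^2, n^2 m\})$.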
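This paper is organized as follows. In Section 2 we prove the main convergence results for FALC and in Section 3 we discuss all the implementation details of FALC. In Section 4 we report the results from our numerical experiments comparing FALC with other algorithms to solve principal component pursuit problems. Finally, in Section 5 we briefly discuss the general problem (1.7) and conclude.

   input: multipliers $\{(\lambda^{(k)}, \epsilon^{(k)}, \tau^{(k)}, \xi^{(k)})\}_{k \in \mathbb{Z}_+}$, $X^{(0)} \in \mathbb{R}^{m \times n}$, $s^{(0)} \in \mathbb{R}^p$, $y^{(0)} \in \mathbb{R}^q$
   $\eta \leftarrow \mu_1 \|\sigma(X^{(0)})\|_\alpha + \mu_2 \|\mathcal{C}(X^{(0)}) - d\|_\beta$
   $\theta_1^{(1)} \leftarrow 0$, $\theta_2^{(1)} \leftarrow 0$, $k \leftarrow 0$
   while (Stopping Criterion not true) do
      $k \leftarrow k + 1$
      $f^{(k)}(X, s, y) := \frac{1}{2} \|\mathcal{A}(X) + y - b - \lambda^{(k)} \theta_1^{(k)}\|_2^2 + \frac{1}{2} \|\mathcal{C}(X) + s - d - \lambda^{(k)} \theta_2^{(k)}\|_2^2$
      $P^{(k)}(X, s, y) := \lambda^{(k)} (\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|s\|_\beta) + f^{(k)}(X, s, y)$
      $h^{(k)}(X, s, y) := \frac{1}{2} \|X - X^{(k-1)}\|_F^2 + \frac{1}{2} \|s - s^{(k-1)}\|_2^2 + \frac{1}{2} \|y - y^{(k-1)}\|_2^2$
1     Use Algorithm 3 in [33] with $h^{(k)}(X, s, y)$ and initial iterate $(X^{(k-1)}, s^{(k-1)}, y^{(k-1)})$ to compute $(X^{(k)}, s^{(k)}, y^{(k)})$ such that
      either
         $P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)}) \leq \inf\{P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \leq \rho\} + \epsilon^{(k)}$
      or
         $\sqrt{\|G\|_F^2 + \|g\|_2^2} \leq \tau^{(k)}$ for some $(G, g) \in \partial_{X,s} P^{(k)}(\cdot, \cdot, \cdot)|_{(X^{(k)}, s^{(k)}, y^{(k)})}$ and
         $\rho \|\nabla_y P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})\|_{\gamma^*} + \nabla_y P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})^T y^{(k)} \leq \xi^{(k)}$
         with $\mu_1 \|\sigma(X^{(k)})\|_\alpha \leq \eta^{(k)}$
2     $\theta_1^{(k+1)} \leftarrow \theta_1^{(k)} - \dfrac{\mathcal{A}(X^{(k)}) + y^{(k)} - b}{\lambda^{(k)}}$
3     $\theta_2^{(k+1)} \leftarrow \theta_2^{(k)} - \dfrac{\mathcal{C}(X^{(k)}) + s^{(k)} - d}{\lambda^{(k)}}$
   return $(X^{(k)}, s^{(k)}, y^{(k)})$

Fig. 2.1. Outline of First-Order Augmented Lagrangian Algorithm (FALC)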
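The multiplier updates used by FALC, $\theta_1^{(k+1)} = \theta_1^{(k)} - (\mathcal{A}(X^{(k)}) + y^{(k)} - b)/\lambda^{(k)}$ and $\theta_2^{(k+1)} = \theta_2^{(k)} - (\mathcal{C}(X^{(k)}) + s^{(k)} - d)/\lambda^{(k)}$, can be sketched as follows. This is an illustrative sketch only: the function name `update_multipliers` is hypothetical, vectors are plain Python lists, and the two residuals are assumed to have been computed already by the caller.

```python
def update_multipliers(theta1, theta2, res_a, res_c, lam):
    """One multiplier update of the outer loop.

    res_a stands for A(X) + y - b and res_c for C(X) + s - d at the current
    inner solution; lam is the penalty multiplier lambda^(k) > 0.
    """
    new_theta1 = [t - r / lam for t, r in zip(theta1, res_a)]
    new_theta2 = [t - r / lam for t, r in zip(theta2, res_c)]
    return new_theta1, new_theta2

# Toy residuals, chosen only to exercise the update.
theta1_next, theta2_next = update_multipliers(
    [0.0, 0.0], [1.0], [0.2, -0.1], [0.5], 0.5)
```

Because the residuals are divided by $\lambda^{(k)}$, which decreases to zero, the analysis has to show separately that the multipliers stay bounded.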

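2. First-order Augmented Lagrangian Algorithm for Composite Norm Minimization. The linear maps $\mathcal{A}$ and $\mathcal{C}$ in (1.1) can be represented as $\mathcal{A}(X) = A\,\mathrm{vec}(X)$ and $\mathcal{C}(X) = C\,\mathrm{vec}(X)$, where $A \in \mathbb{R}^{q \times mn}$ and $C \in \mathbb{R}^{p \times mn}$. By completing squares, it is easy to see that (1.12) is equivalent to

\[
\min \left\{ P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n},\ s \in \mathbb{R}^p,\ \|y\|_\gamma \leq \rho \right\}, \tag{2.1}
\]

where

\[
\begin{array}{rcl}
P^{(k)}(X, s, y) & = & \lambda^{(k)} \left( \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|s\|_\beta \right) + f^{(k)}(X, s, y), \\
f^{(k)}(X, s, y) & = & \tfrac{1}{2} \|\mathcal{A}(X) + y - b - \lambda^{(k)} \theta_1^{(k)}\|_2^2 + \tfrac{1}{2} \|\mathcal{C}(X) + s - d - \lambda^{(k)} \theta_2^{(k)}\|_2^2.
\end{array} \tag{2.2}
\]

We denote the optimal solution of (1.1) by $X_*$ and, for all $k \geq 1$, we denote the optimal solution of the augmented Lagrangian sub-problem (2.1) by $(X_*^{(k)}, s_*^{(k)}, y_*^{(k)})$, i.e. $(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) = \mathrm{argmin}\{P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \leq \rho\}$.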
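The completing-the-square step behind the equivalence of (1.12) and (2.1) can be written out as follows. This is a short worked identity rather than text from the original; $u$ stands for either residual, $\mathcal{A}(X) + y - b$ or $\mathcal{C}(X) + s - d$, and $\theta$ for the corresponding multiplier.

```latex
\begin{align*}
\tfrac{1}{2}\|u - \lambda\theta\|_2^2
  &= \tfrac{1}{2}\|u\|_2^2 - \lambda\,\theta^T u + \tfrac{\lambda^2}{2}\|\theta\|_2^2, \\
\text{hence}\qquad
-\lambda\,\theta^T u + \tfrac{1}{2}\|u\|_2^2
  &= \tfrac{1}{2}\|u - \lambda\theta\|_2^2 - \tfrac{\lambda^2}{2}\|\theta\|_2^2 .
\end{align*}
```

The last term does not depend on $(X, s, y)$, so dropping it changes the optimal value of the subproblem but not its minimizers.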

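The outline of Algorithm FALC is displayed in Figure 2.1. In Algorithm FALC we use Algorithm 3 in [33] with the prox function $h^{(k)}(X, s, y) = \frac{1}{2}\|X - X^{(k-1)}\|_F^2 + \frac{1}{2}\|s - s^{(k-1)}\|_2^2 + \frac{1}{2}\|y - y^{(k-1)}\|_2^2$ and initial iterate $(X^{(k-1)}, s^{(k-1)}, y^{(k-1)})$ to compute a new iterate $(X^{(k)}, s^{(k)}, y^{(k)})$ such that one of the following two stopping conditions holds:

(a) $P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)}) \leq \min\{P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, y : \|y\|_\gamma \leq \rho\} + \epsilon^{(k)}$,

(b) $\sqrt{\|G\|_F^2 + \|g\|_2^2} \leq \tau^{(k)}$ for some $(G, g) \in \partial_{X,s} P^{(k)}(\cdot, \cdot, \cdot)|_{(X^{(k)}, s^{(k)}, y^{(k)})}$, $\mu_1 \|\sigma(X^{(k)})\|_\alpha \leq \eta^{(k)}$, and $\rho \|\nabla_y P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})\|_{\gamma^*} + \nabla_y P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})^T y^{(k)} \leq \xi^{(k)}$,     (2.3)

where $\|\cdot\|_{\gamma^*}$ denotes the dual norm of $\|\cdot\|_\gamma$, $\partial_{X,s} P^{(k)}(\cdot, \cdot, \cdot)|_{(X^{(k)}, s^{(k)}, y^{(k)})}$ denotes the set of partial subgradients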
of the function $P^{(k)}$ at $(X^{(k)}, s^{(k)}, y^{(k)})$, $\eta := \mu_1 \|\sigma(X^{(0)})\|_\alpha + \mu_2 \|\mathcal{C}(X^{(0)}) - d\|_\beta$ for some $X^{(0)} \in \mathbb{R}^{m \times n}$ such that $\mathcal{A}(X^{(0)}) = b$, and $\eta^{(k)} := \eta + \frac{\lambda^{(k)}}{2} \left( \|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2 \right)$.

Since $f^{(k)}(X, s, y)$ is a proper, lower-semicontinuous (lsc), convex function with a Lipschitz continuous gradient $\nabla f^{(k)}$, the results in [33] guarantee that we can compute $(X^{(k)}, s^{(k)}, y^{(k)})$ in $O(1/\sqrt{\epsilon^{(k)}})$ operations. In Lemma 2.3 we prove a uniform bound on the number of operations needed to solve any sub-problem encountered in FALC. In the result below we establish that every limit point of the FALC iterate sequence $\{(X^{(k)}, s^{(k)}, y^{(k)})\}_{k \in \mathbb{Z}_+}$ is an optimal solution for the composite norm minimization problem. In order to compute bounds on the iterates, we need to introduce dual norms. The dual $\|\sigma(\cdot)\|_{\alpha^*}$ of the matrix norm $\|\sigma(\cdot)\|_\alpha$ is defined as

\[
\|\sigma(X)\|_{\alpha^*} = \max\{\langle W, X \rangle : \|\sigma(W)\|_\alpha \leq 1\}.
\]

It is easy to establish that $\alpha^*$ is the Hölder conjugate of $\alpha$, i.e. $\frac{1}{\alpha} + \frac{1}{\alpha^*} = 1$ (see Proposition 2.1 in [31] for details). The dual norm of the vector norm $\|x\|_\beta$ is clearly $\|x\|_{\beta^*}$, where $\beta^*$ is the Hölder conjugate of $\beta$, i.e. $\frac{1}{\beta} + \frac{1}{\beta^*} = 1$. Define

\[
I(\alpha) = \begin{cases} \sqrt{\min\{m, n\}}, & \alpha = \infty, \\ 1, & \text{otherwise}, \end{cases}
\qquad
J(\beta) = \begin{cases} \sqrt{p}, & \beta = \infty, \\ 1, & \text{otherwise}. \end{cases}
\]

Then, it is easy to show that

\[
\frac{1}{I(\alpha^*)} \|\sigma(X)\|_\alpha \leq \|X\|_F \leq I(\alpha) \|\sigma(X)\|_\alpha, \qquad \frac{1}{J(\beta^*)} \|s\|_\beta \leq \|s\|_2 \leq J(\beta) \|s\|_\beta. \tag{2.4}
\]
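The vector half of the norm equivalence (2.4) is easy to check numerically. The sketch below is illustrative only: it assumes $\beta \in \{1, 2, \infty\}$ as in the paper, represents $\infty$ by `math.inf`, and uses hypothetical helper names.

```python
import math

def holder_conjugate(a):
    """Holder conjugate a* with 1/a + 1/a* = 1 (1 <-> inf, 2 <-> 2)."""
    if a == 1:
        return math.inf
    if a == math.inf:
        return 1.0
    return a / (a - 1.0)

def vec_norm(v, a):
    """l_a norm of a list of floats for a in {1, 2, math.inf}."""
    if a == 1:
        return sum(abs(x) for x in v)
    if a == 2:
        return math.sqrt(sum(x * x for x in v))
    return max(abs(x) for x in v)

def J(beta, p):
    """Constant J(beta) from the text: sqrt(p) if beta is infinity, else 1."""
    return math.sqrt(p) if beta == math.inf else 1.0

def satisfies_equivalence(s, beta, tol=1e-12):
    """Check ||s||_beta / J(beta*) <= ||s||_2 <= J(beta) * ||s||_beta."""
    p = len(s)
    lower = vec_norm(s, beta) / J(holder_conjugate(beta), p)
    middle = vec_norm(s, 2)
    upper = J(beta, p) * vec_norm(s, beta)
    return lower <= middle + tol and middle <= upper + tol

sample = [3.0, -4.0, 1.0]
results = [satisfies_equivalence(sample, b) for b in (1, 2, math.inf)]
```

The matrix half of (2.4) is the same statement applied to the vector of singular values, with $\min\{m, n\}$ in place of $p$.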

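Theorem 2.1. Let $\mathcal{X} = \{X^{(k)} : k \in \mathbb{Z}_+\}$ denote the sequence of iterates generated by the First-Order Augmented Lagrangian Algorithm (FALC) displayed in Figure 2.1 for a fixed sequence of parameters $\{\lambda^{(k)}, \epsilon^{(k)}, \tau^{(k)}, \xi^{(k)}\}_{k \in \mathbb{Z}_+}$ such that

(i) penalty multipliers, $\lambda^{(k)} \searrow 0$, and

(ii) approximate optimality parameters, $\epsilon^{(k)} \searrow 0$, such that $\frac{\epsilon^{(k)}}{(\lambda^{(k)})^2} \leq B$ for all $k \geq 1$ for some $B > 0$,

(iii) subgradient tolerance parameters, $\tau^{(k)} \searrow 0$ and $\xi^{(k)} \searrow 0$, such that $\frac{\tau^{(k)}}{\lambda^{(k)}} \to 0$ and $\frac{\xi^{(k)}}{\lambda^{(k)}} \to 0$ as $k \to \infty$.

Then $\mathcal{X} = \{X^{(k)} : k \in \mathbb{Z}_+\}$ is a bounded sequence and any limit point $\bar{X}$ of this sequence $\{X^{(k)}\}_{k \in \mathbb{Z}_+}$ is an optimal solution of the composite norm minimization problem (1.1).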

Remark 2.1. The notation $\gamma^{(k)} \searrow \gamma$ (resp. $\gamma^{(k)} \nearrow \gamma$) denotes that the sequence $\{\gamma^{(k)}\}_{k \in \mathbb{Z}_+}$ is monotonically decreasing (resp. increasing) to $\gamma$.
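One simple family of parameter sequences meeting the requirements of Theorem 2.1 (penalty multipliers decreasing to zero, $\epsilon^{(k)}/(\lambda^{(k)})^2 \leq B$, and $\tau^{(k)}/\lambda^{(k)}$, $\xi^{(k)}/\lambda^{(k)}$ tending to zero) is sketched below. The geometric decay of $\lambda^{(k)}$ and all default constants are assumptions made for illustration, not the paper's recommended settings; the choice of $\tau^{(k)}$ and $\xi^{(k)}$ proportional to $\epsilon^{(k)}$ mirrors the one used later in Theorem 2.4.

```python
def falc_parameters(num_iters, lam0=1.0, c=0.5, B=1.0, kappa1=1.0, kappa2=1.0):
    """Return a list of (lambda, epsilon, tau, xi) tuples for k = 1..num_iters.

    lambda^(k) = lam0 * c**k decreases geometrically (assumed schedule),
    epsilon^(k) = B * lambda^(k)**2 so that epsilon / lambda**2 <= B,
    tau^(k) = kappa1 * epsilon^(k) and xi^(k) = kappa2 * epsilon^(k),
    so tau / lambda and xi / lambda shrink like lambda.
    """
    params = []
    for k in range(1, num_iters + 1):
        lam = lam0 * c ** k
        eps = B * lam ** 2
        params.append((lam, eps, kappa1 * eps, kappa2 * eps))
    return params

schedule = falc_parameters(10)
```

With this choice $\epsilon^{(k)}/\lambda^{(k)} = B \lambda^{(k)}$, so all three ratios required to vanish do so at the same geometric rate as $\lambda^{(k)}$.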

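Proof. Due to Lemma 2.3 we are guaranteed that each inner loop terminates after a finite number of steps. Hence, the sequence $\{(X^{(k)}, s^{(k)}, y^{(k)})\}_{k \in \mathbb{Z}_+}$ exists.

First, we show that for all $k \geq 1$,

\[
\|\theta_2^{(k+1)}\|_2 \leq \max\left\{ \frac{\sqrt{2 \epsilon^{(k)}}\, \sigma_{\max}(M)}{\lambda^{(k)}}, \frac{\tau^{(k)}}{\lambda^{(k)}} \right\} + J(\beta^*) \mu_2, \tag{2.5a}
\]
\[
\|\mathcal{A}^*(\theta_1^{(k+1)})\|_F \leq \|\mathcal{C}^*(\theta_2^{(k+1)})\|_F + \max\left\{ \frac{\sqrt{2 \epsilon^{(k)}}\, \sigma_{\max}(M)}{\lambda^{(k)}}, \frac{\tau^{(k)}}{\lambda^{(k)}} \right\} + I(\alpha^*) \mu_1. \tag{2.5b}
\]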

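Consider the following two cases.

(a) The $k$-th inner loop terminates with the iterate $(X^{(k)}, s^{(k)}, y^{(k)})$ satisfying (2.3)(a). Then Corollary A.2 guarantees that

\[
\|\mathcal{C}(X^{(k)}) + s^{(k)} - d - \lambda^{(k)} \theta_2^{(k)}\|_2 \leq \sqrt{2 \epsilon^{(k)}}\, \sigma_{\max}(M) + J(\beta^*) \lambda^{(k)} \mu_2, \tag{2.6a}
\]
\[
\|\mathcal{A}^*(\mathcal{A}(X^{(k)}) + y^{(k)} - b - \lambda^{(k)} \theta_1^{(k)}) + \mathcal{C}^*(\mathcal{C}(X^{(k)}) + s^{(k)} - d - \lambda^{(k)} \theta_2^{(k)})\|_F \leq \sqrt{2 \epsilon^{(k)}}\, \sigma_{\max}(M) + I(\alpha^*) \lambda^{(k)} \mu_1. \tag{2.6b}
\]

(b) The $k$-th inner loop terminates with an iterate $(X^{(k)}, s^{(k)}, y^{(k)})$ that satisfies (2.3)(b). Hence, there exist $Q^{(k)} \in \partial \|\sigma(\cdot)\|_\alpha |_{X^{(k)}}$ and $q^{(k)} \in \partial \|\cdot\|_\beta |_{s^{(k)}}$ such that

\[
\sqrt{ \|\lambda^{(k)} \mu_1 Q^{(k)} + \nabla_X f^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})\|_F^2 + \|\lambda^{(k)} \mu_2 q^{(k)} + \nabla_s f^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})\|_2^2 } \leq \tau^{(k)}.
\]

Since $\|q^{(k)}\|_{\beta^*} \leq 1$ and $\|\sigma(Q^{(k)})\|_{\alpha^*} \leq 1$, from the definition of $I(\cdot)$ and $J(\cdot)$ in (2.4), it follows that $\|Q^{(k)}\|_F \leq I(\alpha^*)$ and $\|q^{(k)}\|_2 \leq J(\beta^*)$. Then we have

\[
\|\mathcal{C}(X^{(k)}) + s^{(k)} - d - \lambda^{(k)} \theta_2^{(k)}\|_2 \leq \tau^{(k)} + J(\beta^*) \lambda^{(k)} \mu_2, \tag{2.7a}
\]
\[
\|\mathcal{A}^*(\mathcal{A}(X^{(k)}) + y^{(k)} - b - \lambda^{(k)} \theta_1^{(k)}) + \mathcal{C}^*(\mathcal{C}(X^{(k)}) + s^{(k)} - d - \lambda^{(k)} \theta_2^{(k)})\|_F \leq \tau^{(k)} + I(\alpha^*) \lambda^{(k)} \mu_1. \tag{2.7b}
\]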

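Thus, from (2.6) and (2.7), it follows that for all $k \geq 1$

\[
\|\mathcal{C}(X^{(k)}) + s^{(k)} - d - \lambda^{(k)} \theta_2^{(k)}\|_2 \leq \max\{\sqrt{2 \epsilon^{(k)}}\, \sigma_{\max}(M), \tau^{(k)}\} + J(\beta^*) \lambda^{(k)} \mu_2, \tag{2.8a}
\]
\[
\|\mathcal{A}^*(\mathcal{A}(X^{(k)}) + y^{(k)} - b - \lambda^{(k)} \theta_1^{(k)})\|_F \leq \max\{\sqrt{2 \epsilon^{(k)}}\, \sigma_{\max}(M), \tau^{(k)}\} + \|\mathcal{C}^*(\mathcal{C}(X^{(k)}) + s^{(k)} - d - \lambda^{(k)} \theta_2^{(k)})\|_F + I(\alpha^*) \lambda^{(k)} \mu_1. \tag{2.8b}
\]

Since $\theta_1^{(k+1)} = \theta_1^{(k)} - \frac{\mathcal{A}(X^{(k)}) + y^{(k)} - b}{\lambda^{(k)}}$ and $\theta_2^{(k+1)} = \theta_2^{(k)} - \frac{\mathcal{C}(X^{(k)}) + s^{(k)} - d}{\lambda^{(k)}}$, dividing both sides of (2.8a) and (2.8b) by $\lambda^{(k)}$ shows that $\{(\theta_1^{(k)}, \theta_2^{(k)})\}_{k \in \mathbb{Z}_+}$ satisfies (2.5). Because $\mathcal{A}$ has full row-rank, $\frac{\epsilon^{(k)}}{(\lambda^{(k)})^2} \leq B$ and $\frac{\tau^{(k)}}{\lambda^{(k)}} \to 0$, (2.5) implies that there exist $B_{\theta_1} > 0$ and $B_{\theta_2} > 0$ such that for all $k \geq 1$

\[
\|\theta_1^{(k)}\|_2 \leq B_{\theta_1}, \qquad \|\theta_2^{(k)}\|_2 \leq B_{\theta_2}. \tag{2.9}
\]

From (2.9), it follows that for $i = 1, 2$,

\[
\lim_{k \in \mathbb{Z}_+} \lambda^{(k)} \theta_i^{(k)} = 0, \tag{2.10}
\]

and

\[
\lim_{k \in \mathbb{Z}_+} \lambda^{(k)} \|\theta_i^{(k)}\|_2^2 = 0. \tag{2.11}
\]

Also, $\frac{\epsilon^{(k)}}{(\lambda^{(k)})^2} \leq B$ for all $k \geq 1$ implies that

\[
\lim_{k \in \mathbb{Z}_+} \frac{\epsilon^{(k)}}{\lambda^{(k)}} = 0. \tag{2.12}
\]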

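We next show that $\{(X^{(k)}, s^{(k)}, y^{(k)})\}_{k \in \mathbb{Z}_+}$ is a bounded sequence. Consider the following two possibilities.

(a) $(X^{(k)}, s^{(k)}, y^{(k)})$ satisfies (2.3)(a). Recall that $X^{(0)} = \mathrm{argmin}\{\|X\|_F : \mathcal{A}(X) = b\}$. Define $s^{(0)} = d - \mathcal{C}(X^{(0)})$ and $y^{(0)} = b - \mathcal{A}(X^{(0)}) = 0$. Then

\[
P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) \leq P^{(k)}(X^{(0)}, s^{(0)}, y^{(0)}) = \lambda^{(k)} \eta + \tfrac{1}{2} \left( \|\lambda^{(k)} \theta_1^{(k)}\|_2^2 + \|\lambda^{(k)} \theta_2^{(k)}\|_2^2 \right),
\]

where $\eta := \mu_1 \|\sigma(X^{(0)})\|_\alpha + \mu_2 \|\mathcal{C}(X^{(0)}) - d\|_\beta$. Hence,

\[
\mu_1 \|\sigma(X^{(k)})\|_\alpha \leq \frac{P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})}{\lambda^{(k)}} \leq \frac{P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) + \epsilon^{(k)}}{\lambda^{(k)}} \leq \eta^{(k)} + \frac{\epsilon^{(k)}}{\lambda^{(k)}}, \tag{2.13}
\]

where $\eta^{(k)} = \eta + \frac{\lambda^{(k)}}{2} \left( \|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2 \right)$.

(b) $(X^{(k)}, s^{(k)}, y^{(k)})$ satisfies (2.3)(b). Then trivially, $\mu_1 \|\sigma(X^{(k)})\|_\alpha \leq \eta^{(k)}$.

Hence, from (2.13), we can conclude that for all $k \geq 1$, $\mu_1 \|\sigma(X^{(k)})\|_\alpha \leq \eta^{(k)} + \frac{\epsilon^{(k)}}{\lambda^{(k)}}$. Hence, $\mu_1 \|\sigma(X^{(k)})\|_\alpha \leq \eta + \lambda^{(k)} \left( \frac{B_{\theta_1}^2 + B_{\theta_2}^2}{2} + B \right)$ for all $k \geq 1$.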



Therefore, we can conclude that there exists a subsequence $\mathcal{K} \subset \mathbb{Z}_+$ such that $\lim_{k \in \mathcal{K}} X^{(k)} = \bar{X}$ exists. Furthermore, (2.6a) and (2.7a) guarantee that $\lim_{k \in \mathcal{K}} s^{(k)} = \bar{s}$ exists and similarly (2.6b) and (2.7b) guarantee that $\lim_{k \in \mathcal{K}} y^{(k)} = \bar{y}$ exists. In the rest of the proof, we will show that $\bar{X} \in \mathrm{argmin}\{\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta : \|\mathcal{A}(X) - b\|_\gamma \leq \rho\}$. We consider the following two cases.

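(a) There exists a further subsequence $\mathcal{K}_1 \subset \mathcal{K}$ such that for all $k \in \mathcal{K}_1$, $(X^{(k)}, s^{(k)}, y^{(k)})$ satisfies (2.3)(a), i.e. the sequence $\{(X^{(k)}, s^{(k)}, y^{(k)})\}_{k \in \mathbb{Z}_+}$ computed in Step 1 of FALC satisfies

\[
0 \leq P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)}) - P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) \leq \epsilon^{(k)}, \qquad \forall k \geq 1. \tag{2.14}
\]

Let $X_*$ denote any optimal solution of the composite norm minimization problem and let $s_* = d - \mathcal{C}(X_*)$ and $y_* = b - \mathcal{A}(X_*)$. Since $(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) = \mathrm{argmin}\{P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \leq \rho\}$ for $k \geq 1$, it follows that $P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) \leq P^{(k)}(X_*, s_*, y_*)$. Thus, (2.14) implies that $P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)}) \leq P^{(k)}(X_*, s_*, y_*) + \epsilon^{(k)}$. Hence, for all $k \geq 1$,

\[
\begin{array}{rcl}
\mu_1 \|\sigma(X^{(k)})\|_\alpha + \mu_2 \|s^{(k)}\|_\beta & \leq & \dfrac{P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})}{\lambda^{(k)}} \\
 & \leq & \dfrac{P^{(k)}(X_*, s_*, y_*) + \epsilon^{(k)}}{\lambda^{(k)}} \\
 & = & \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta + \dfrac{\lambda^{(k)}}{2} \left( \|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2 \right) + \dfrac{\epsilon^{(k)}}{\lambda^{(k)}}.
\end{array} \tag{2.15}
\]

Taking the limit of both sides of (2.15) along the subsequence $\mathcal{K}_1$, and using the fact that $\bar{s} = d - \mathcal{C}(\bar{X})$, we get

\[
\begin{array}{rcl}
\mu_1 \|\sigma(\bar{X})\|_\alpha + \mu_2 \|\mathcal{C}(\bar{X}) - d\|_\beta & = & \lim_{k \in \mathcal{K}_1} \mu_1 \|\sigma(X^{(k)})\|_\alpha + \mu_2 \|s^{(k)}\|_\beta \\
 & \leq & \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta + \lim_{k \in \mathcal{K}_1} \left[ \dfrac{\lambda^{(k)}}{2} \left( \|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2 \right) + \dfrac{\epsilon^{(k)}}{\lambda^{(k)}} \right] \\
 & = & \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta,
\end{array} \tag{2.16}
\]

where (2.16) follows from the fact that $\{\theta_i^{(k)}\}$ is uniformly bounded for $i = 1, 2$, $\lambda^{(k)} \to 0$, and $\epsilon^{(k)}/\lambda^{(k)} \to 0$. Taking the limit of both sides of (2.8b) along $\mathcal{K}_1$ and using (2.10), we get

\[
\|\mathcal{A}^*(\mathcal{A}(\bar{X}) + \bar{y} - b)\|_F \leq 0, \tag{2.17}
\]

and since $\mathcal{A}$ has full row rank, it follows that $\mathcal{A}(\bar{X}) + \bar{y} = b$. Since $\|y^{(k)}\|_\gamma \leq \rho$ for all $k \geq 1$, we can conclude that $\bar{X}$ is feasible, i.e. $\|\mathcal{A}(\bar{X}) - b\|_\gamma \leq \rho$. Thus, from (2.16) and the fact that $X_* \in \mathrm{argmin}\{\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta : \|\mathcal{A}(X) - b\|_\gamma \leq \rho\}$, it follows that $\bar{X}$ is an optimal solution for the composite norm minimization problem (1.1).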
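(b) There exists $K \in \mathcal{K}$ such that, for all $k \in \mathcal{K}_2 = \mathcal{K} \cap \{k \geq K\}$, the inner iterations for the $k$-th subproblem terminate with an iterate $(X^{(k)}, s^{(k)}, y^{(k)})$ that satisfies (2.3)(b).

For all $k \in \mathcal{K}_2$, there exist $Q^{(k)} \in \partial \|\sigma(\cdot)\|_\alpha |_{X^{(k)}}$ and $q^{(k)} \in \partial \|\cdot\|_\beta |_{s^{(k)}}$ such that (2.3)(b) holds. Hence, we have

\[
\|\lambda^{(k)} \mu_2 q^{(k)} + \nabla_s f^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})\|_2 \leq \tau^{(k)}, \tag{2.18a}
\]
\[
\|\lambda^{(k)} \mu_1 Q^{(k)} + \nabla_X f^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})\|_F \leq \tau^{(k)}, \tag{2.18b}
\]
\[
\rho \|\nabla_y f^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})\|_{\gamma^*} + \nabla_y f^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})^T y^{(k)} \leq \xi^{(k)}. \tag{2.18c}
\]

Since $\|\theta_i^{(k)}\|_2 \leq B_{\theta_i}$ for all $k \geq 1$, we have $\lim_{k \to \infty} \lambda^{(k)} \theta_i^{(k)} = 0$ for $i \in \{1, 2\}$. Taking the limit of both sides of (2.7a) for $k \in \mathcal{K}_2$, we have $\|\mathcal{C}(\bar{X}) + \bar{s} - d\|_2 \leq 0$, i.e. $\bar{s} = d - \mathcal{C}(\bar{X})$. Moreover, taking the limit of both sides of (2.7b) for $k \in \mathcal{K}_2$ and using the fact that $\bar{s} = d - \mathcal{C}(\bar{X})$, we have $\|\mathcal{A}^*(\mathcal{A}(\bar{X}) + \bar{y} - b)\|_F \leq 0$. Since $\mathcal{A}$ has full row rank, it follows that

\[
\mathcal{A}(\bar{X}) + \bar{y} = b, \qquad \|\bar{y}\|_\gamma \leq \rho. \tag{2.19}
\]

For all $k \in \mathcal{K}_2$, $Q^{(k)} \in \partial \|\sigma(\cdot)\|_\alpha |_{X^{(k)}}$ and $q^{(k)} \in \partial \|\cdot\|_\beta |_{s^{(k)}}$; therefore, $\|\sigma(Q^{(k)})\|_{\alpha^*} \leq 1$ and $\|q^{(k)}\|_{\beta^*} \leq 1$. Hence, there exists a subsequence $\mathcal{K}_3 \subset \mathcal{K}_2$ such that $\lim_{k \in \mathcal{K}_3} (Q^{(k)}, q^{(k)}) = (\bar{Q}, \bar{q})$ exists. One can easily show that $\bar{Q} \in \partial \|\sigma(\cdot)\|_\alpha |_{\bar{X}}$ and $\bar{q} \in \partial \|\cdot\|_\beta |_{\bar{s}}$. Dividing both sides of (2.18a) by $\lambda^{(k)}$, we get

\[
\|\mu_2 q^{(k)} - \theta_2^{(k+1)}\|_2 \leq \frac{\tau^{(k)}}{\lambda^{(k)}}, \tag{2.20}
\]

for all $k \in \mathcal{K}_3 \subset \mathcal{K}_2$. Since $\lim_{k \in \mathcal{K}_3} q^{(k)} = \bar{q}$ and $\lim_{k \in \mathbb{Z}_+} \frac{\tau^{(k)}}{\lambda^{(k)}} = 0$, it follows that $\lim_{k \in \mathcal{K}_3} \theta_2^{(k+1)} = \bar{\theta}_2$ exists and, taking the limit of both sides of (2.20), we have $\bar{\theta}_2 = \mu_2 \bar{q}$.

Dividing both sides of (2.18b) by $\lambda^{(k)}$, we get

\[
\|\mu_1 Q^{(k)} - \mathcal{A}^*(\theta_1^{(k+1)}) - \mathcal{C}^*(\theta_2^{(k+1)})\|_F \leq \frac{\tau^{(k)}}{\lambda^{(k)}}, \tag{2.21}
\]

for all $k \in \mathcal{K}_3 \subset \mathcal{K}_2$. Since $\lim_{k \in \mathcal{K}_3} Q^{(k)} = \bar{Q}$, $\lim_{k \in \mathbb{Z}_+} \frac{\tau^{(k)}}{\lambda^{(k)}} = 0$ and $\mathcal{A}$ has full row rank, it follows that $\lim_{k \in \mathcal{K}_3} \theta_1^{(k+1)} = \bar{\theta}_1$ exists and, taking the limit of both sides of (2.21), we have $\mu_1 \bar{Q} - \mu_2 \mathcal{C}^*(\bar{q}) = \mathcal{A}^*(\bar{\theta}_1)$. Note that $\bar{q} \in \partial \|\cdot\|_\beta |_{\bar{s}}$ and $\bar{s} = d - \mathcal{C}(\bar{X})$. Hence, $-\mathcal{C}^*(\bar{q}) \in \partial \|d - \mathcal{C}(\cdot)\|_\beta |_{\bar{X}}$ and we have

\[
\mathcal{A}^*(\bar{\theta}_1) = G_* \in \partial \left( \mu_1 \|\sigma(\cdot)\|_\alpha + \mu_2 \|d - \mathcal{C}(\cdot)\|_\beta \right) |_{\bar{X}}, \tag{2.22}
\]

where $G_* := \mu_1 \bar{Q} - \mu_2 \mathcal{C}^*(\bar{q})$. Dividing both sides of (2.18c) by $\lambda^{(k)}$ we get

\[
\rho \|\theta_1^{(k+1)}\|_{\gamma^*} - (\theta_1^{(k+1)})^T y^{(k)} \leq \frac{\xi^{(k)}}{\lambda^{(k)}}, \tag{2.23}
\]

for all $k \in \mathcal{K}_3 \subset \mathcal{K}_2$. Since $\lim_{k \in \mathcal{K}_3} \theta_1^{(k+1)} = \bar{\theta}_1$, taking the limit of both sides of (2.23) and multiplying by $-1$, we have

\[
0 \leq -\rho \|\bar{\theta}_1\|_{\gamma^*} + \bar{\theta}_1^T \bar{y} = \min_{y : \|y\|_\gamma \leq \rho} -\bar{\theta}_1^T (y - \bar{y}).
\]

Thus,

\[
-\bar{\theta}_1^T (y - \bar{y}) \geq 0 \qquad \forall y : \|y\|_\gamma \leq \rho. \tag{2.24}
\]

Consequently, (2.22) and (2.24) together imply that $(\bar{X}, \bar{y})$ satisfies the first order optimality conditions of the relaxed problem

\[
\min_{X, y} \left\{ \mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta - \bar{\theta}_1^T (\mathcal{A}(X) + y - b) : \|y\|_\gamma \leq \rho \right\}. \tag{2.25}
\]

Since (2.25) is convex, it follows that $(\bar{X}, \bar{y})$ is an optimal solution to the relaxed problem (2.25). Moreover, from (2.19), $(\bar{X}, \bar{y})$ is feasible to the composite norm minimization problem, i.e. $\min\{\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|d - \mathcal{C}(X)\|_\beta : \mathcal{A}(X) + y = b, \|y\|_\gamma \leq \rho\}$. Therefore, $\bar{X} \in \mathrm{argmin}\{\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|d - \mathcal{C}(X)\|_\beta : \|\mathcal{A}(X) - b\|_\gamma \leq \rho\}$. $\Box$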

For compressed sensing and matrix completion problems, exact recovery occurs only when $\min_{x \in \mathbb{R}^n} \{\|x\|_1 : Ax = b\}$ and $\min_{X \in \mathbb{R}^{m \times n}} \{\|X\|_* : X_{ij} = M_{ij}, \ (i, j) \in \Omega\}$, respectively, have unique solutions. The following corollary establishes that FALC converges to this solution.

Corollary 2.2. Suppose the composite norm minimization problem (1.1) has a unique optimal solution $X_*$. Let $\{X^{(k)} : k \in \mathbb{Z}_+\}$ denote the sequence of iterates generated by the First-Order Augmented Lagrangian Algorithm (FALC) displayed in Figure 2.1 when the sequence $\{(\lambda^{(k)}, \epsilon^{(k)}, \tau^{(k)}, \xi^{(k)})\}_{k \in \mathbb{Z}_+}$ satisfies all the conditions in Theorem 2.1. Then $\lim_{k \to \infty} X^{(k)} = X_*$, where $X_* = \mathrm{argmin}_{X \in \mathbb{R}^{m \times n}} \{\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta : \|\mathcal{A}(X) - b\|_\gamma \leq \rho\}$.

We next establish a bound on the complexity of computing the iterate $(X^{(k)}, s^{(k)}, y^{(k)})$ using Algorithm 3 in [33].

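Lemma 2.3. For all $k \geq 1$, the worst-case complexity of computing $(X^{(k)}, s^{(k)}, y^{(k)})$ is

\[
O\left( \frac{\min\{n m^2, n^2 m\}}{\lambda^{(k)}} \right), \tag{2.26}
\]

when $\lambda^{(k)} \searrow 0$, $\epsilon^{(k)} \searrow 0$ such that $\frac{\epsilon^{(k)}}{(\lambda^{(k)})^2} \leq B$ for all $k \geq 1$.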

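Proof. Let $\{(X^{(k,\ell)}, s^{(k,\ell)}, y^{(k,\ell)})\}_{\ell \in \mathbb{Z}_+}$ denote the iterates computed by Algorithm 3 in [33] when applied to the $k$-th sub-problem

\[
\min\{P^{(k)}(X, s, y) = \lambda^{(k)} (\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|s\|_\beta) + f^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \leq \rho\},
\]

with the prox function $h^{(k)}(X, s, y) = \frac{1}{2}\|X - X^{(k-1)}\|_F^2 + \frac{1}{2}\|s - s^{(k-1)}\|_2^2 + \frac{1}{2}\|y - y^{(k-1)}\|_2^2$ and initial iterate $(X^{(k-1)}, s^{(k-1)}, y^{(k-1)}) \in \mathbb{R}^{m \times n} \times \mathbb{R}^p \times \mathbb{R}^q$. Then Corollary 3 in [33] establishes that for all iterates

\[
\ell \geq \sqrt{\frac{4 L\, h^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)})}{\epsilon^{(k)}}},
\]
\[
P^{(k)}(X^{(k,\ell)}, s^{(k,\ell)}, y^{(k,\ell)}) \leq \inf\{P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \leq \rho\} + \epsilon^{(k)},
\]

where $(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) = \mathrm{argmin}\{P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \leq \rho\}$ and $L = \sigma_{\max}^2(M)$ is the Lipschitz constant of $\nabla f^{(k)}$ for all $k \geq 1$.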

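Let $X_*$ denote the optimal solution of (1.1), $s_* = d - \mathcal{C}(X_*)$ and $y_* = b - \mathcal{A}(X_*)$. Then we have $\|y_*\|_\gamma \leq \rho$. Since $(X_*, s_*, y_*)$ is feasible to the $k$-th subproblem, we have $P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) \leq P^{(k)}(X_*, s_*, y_*)$, which implies

\[
\mu_1 \|\sigma(X_*^{(k)})\|_\alpha + \mu_2 \|s_*^{(k)}\|_\beta \leq \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta + \frac{\lambda^{(k)}}{2} \left( \|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2 \right).
\]

From (2.15), it follows that the inexact minimizer $(X^{(k-1)}, s^{(k-1)}, y^{(k-1)})$ of the $(k-1)$-th subproblem satisfies

\[
\mu_1 \|\sigma(X^{(k-1)})\|_\alpha + \mu_2 \|s^{(k-1)}\|_\beta \leq \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta + \frac{\lambda^{(k-1)}}{2} \left( \|\theta_1^{(k-1)}\|_2^2 + \|\theta_2^{(k-1)}\|_2^2 \right) + \frac{\epsilon^{(k-1)}}{\lambda^{(k-1)}}.
\]

Since $\|\theta_1^{(k)}\|_2 \leq B_{\theta_1}$ and $\|\theta_2^{(k)}\|_2 \leq B_{\theta_2}$ for all $k \geq 1$ (see (2.9)), and $\{\lambda^{(k)}\}_{k \in \mathbb{Z}_+}$ is a decreasing sequence, adding the two inequalities above gives

\[
\mu_1 \left( \|\sigma(X_*^{(k)})\|_\alpha + \|\sigma(X^{(k-1)})\|_\alpha \right) + \mu_2 \left( \|s_*^{(k)}\|_\beta + \|s^{(k-1)}\|_\beta \right) \leq 2 \left( \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta \right) + \lambda^{(k-1)} \left( B_{\theta_1}^2 + B_{\theta_2}^2 \right) + \frac{\epsilon^{(k-1)}}{\lambda^{(k-1)}}. \tag{2.27}
\]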

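From the definition of $h^{(k)}(\cdot)$ it follows that

\[
h^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) \leq \frac{1}{2} \left( \frac{I^2(\alpha)}{\mu_1^2} + \frac{J^2(\beta)}{\mu_2^2} \right) \left( 2 \left( \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta \right) + \lambda^{(k-1)} \left( B_{\theta_1}^2 + B_{\theta_2}^2 \right) + \frac{\epsilon^{(k-1)}}{\lambda^{(k-1)}} \right)^2 + 2 \rho^2 J^2(\gamma), \tag{2.28}
\]

where (2.28) follows from the fact that $\|y_*^{(k)}\|_\gamma \leq \rho$, $\|y^{(k-1)}\|_\gamma \leq \rho$ and (2.27) together with the definition of $I(\cdot)$ and $J(\cdot)$ in (2.4). Thus,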



Jk) - Ak) 



where 



Ml M2 / \ ^ 



^ = \ 8 -V^ + -F Ml|k(X.)|U + P2\\C{X.) - d\\p + — {Bl + 52 + S) + p2 J2(^). 



Each step of Algorithm 3 in [33] involves computing the solution of the following optimization problems.

(a) One matrix shrinkage problem of the form

\[
\min_{X \in \mathbb{R}^{m \times n}} \left\{ \delta_1 \|\sigma(X)\|_\alpha + \tfrac{1}{2} \|X - Y\|_F^2 : \|\sigma(X)\|_\alpha \leq \rho_1 \right\}. \tag{2.29}
\]

Lemma A.4 establishes that when $\alpha \in \{1, \infty\}$ the worst-case complexity of computing a solution to (2.29) is the same as that of computing a full SVD, i.e. $O(\min\{n m^2, n^2 m\})$, and when $\alpha = 2$, the worst-case complexity is $O(mn)$.

(b) One vector shrinkage problem of the form

\[
\min_{s \in \mathbb{R}^p} \left\{ \delta_2 \|s\|_\beta + \tfrac{1}{2} \|s - q\|_2^2 : \|s\|_\beta \leq \rho_2 \right\} \tag{2.30}
\]

for a given $q \in \mathbb{R}^p$. Lemma A.4 establishes that the complexity of solving the vector shrinkage problem is $O(p \log(p))$ when $\beta \in \{1, \infty\}$ and $O(p)$ when $\beta = 2$.

(c) One vector shrinkage problem of the form

\[
\min_{y \in \mathbb{R}^q} \left\{ \tfrac{1}{2} \|y - z\|_2^2 : \|y\|_\gamma \leq \rho_3 \right\} \tag{2.31}
\]

for a given $z \in \mathbb{R}^q$. Lemma A.4 establishes that the complexity of solving the vector shrinkage problem is $O(q \log(q))$ when $\gamma \in \{1, \infty\}$ and $O(q)$ when $\gamma = 2$. $\Box$
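Problem (2.31) is a Euclidean projection onto a norm ball. For two of the three norms it has a simple closed form, sketched below under the assumption that vectors are plain Python lists; the function names are hypothetical and the $\ell_1$-ball case, which requires a sorting-based procedure, is deliberately left out.

```python
import math

def project_l2_ball(z, rho):
    """Closest point to z (in l2 distance) with ||y||_2 <= rho."""
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= rho:
        return list(z)                    # already feasible
    return [rho * v / norm for v in z]    # rescale onto the sphere

def project_linf_ball(z, rho):
    """Closest point to z (in l2 distance) with ||y||_inf <= rho."""
    return [min(max(v, -rho), rho) for v in z]   # clip each coordinate

y_l2 = project_l2_ball([3.0, 4.0], 1.0)
y_linf = project_linf_ball([3.0, -4.0, 0.5], 1.0)
```

Both projections run in time linear in the length of $z$, consistent with the $O(q)$ cost quoted for $\gamma = 2$.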

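Next, we characterize the finite iteration performance of FALC. This analysis will lead to a convergence rate result in Theorem 2.5.

Theorem 2.4. Let $\{(X^{(k)}, s^{(k)}, y^{(k)})\}_{k \in \mathbb{Z}_+}$ denote the sequence of iterates generated by the FALC displayed in Figure 2.1. Suppose there exists $B > 0$ such that $\frac{\epsilon^{(k)}}{(\lambda^{(k)})^2} \leq B$, $\tau^{(k)} = \kappa_1 \epsilon^{(k)}$ and $\xi^{(k)} = \kappa_2 \epsilon^{(k)}$ for all $k \geq 1$, so that $\frac{\tau^{(k)}}{\lambda^{(k)}} \to 0$ and $\frac{\xi^{(k)}}{\lambda^{(k)}} \to 0$ as $k \to \infty$. Then there exist $c_1 > 0$, $c_2 > 0$ and $c_3 > 0$ such that for all $k \geq 1$,

(i) $\|y^{(k)}\|_\gamma \leq \rho$ such that $\|\mathcal{A}(X^{(k)}) + y^{(k)} - b\|_2 \leq c_1 \lambda^{(k)}$,

(ii) $\left| \left( \mu_1 \|\sigma(X^{(k)})\|_\alpha + \mu_2 \|\mathcal{C}(X^{(k)}) - d\|_\beta \right) - \left( \mu_1 \|\sigma(X_*)\|_\alpha + \mu_2 \|\mathcal{C}(X_*) - d\|_\beta \right) \right| \leq c_2 \lambda^{(k)} + c_3 \sqrt{\epsilon^{(k)}}$,

where $X_* = \mathrm{argmin}_{X \in \mathbb{R}^{m \times n}} \{\mu_1 \|\sigma(X)\|_\alpha + \mu_2 \|\mathcal{C}(X) - d\|_\beta : \|\mathcal{A}(X) - b\|_\gamma \leq \rho\}$.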

Proof. Given $k \ge 1$, consider the following two cases:
(a) $(X^{(k)}, s^{(k)}, y^{(k)})$ satisfies (2.3)(a). Then, from (2.5) it follows that

Mi|k(X(^-))|U-hA^2pWb< piMX,)\U+P2\\C{X,)-d\\0 
(b) $(X^{(k)}, s^{(k)}, y^{(k)})$ satisfies (2.3)(b). Then from the convexity of $P^{(k)}$, it follows that

p(fe)(X«,sW,yW) 

< Pf'^^x,, s,) + IIGIIf ||X, - X^llf + ||g||2||.s, - sW||2 - min V,pW(xW, s«, y^f (y - y^), 

y:||ylk<p 

<pW(X„s,)+tW(||X,-XW||,. + ||,s, -sW||2)+V,pW(xW,sW,yW)V^ 
+p||V,pW(xW,sW,yW)||^., 

< P('=)(X*,S*) +t('^-)(||X* - Xf'^^llF + ||s* - s^'^'lla) +C"'^ (2.33) 

where {G,g) £ 9P('^^(., .)|(x('=),s('=)) and 9P('^^(., .)|(x('=).s('=)) denotes the set of subgradients of the func- 
tion PC^') at (X('=),s('=)). Hence, it follows that 

A.i|k(xW)|U + A*2||.s('=^||;3 < Milk(x,)|U + ^^2\\cix^) d\\, + ^ [wei'^wl + ll^f II2) 

T-(fc) f(fe) 

+ ^(^(||X, - X«||^ + \\s. - 5WII2) + |(^, (2.34) 

Thus, from ([232]) and ^^M^ . it follows that for ah A: > 1 

/.i||a(xW)|U + M2P('='||^ < Milk(^*)ll« + A^2||C(X,) - d||^ + (^k±^\ A^ 

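where (2.35) follows from the bound on $\|\theta_i^{(k)}\|_2$ for $i \in \{1, 2\}$ established in (2.9). From (2.4) and Corollary A.2 in Appendix A, it follows that for all $k \ge 1$,

$$\|C(X^{(k)}) + s^{(k)} - d - \lambda^{(k)}\theta_2^{(k)}\|_\beta \le J(\beta^*)\, \|C(X^{(k)}) + s^{(k)} - d - \lambda^{(k)}\theta_2^{(k)}\|_2 \le J(\beta^*) \left( \sqrt{2\epsilon^{(k)}}\, \sigma_{\max}(M) + J(\beta^*)\mu_2\lambda^{(k)} \right). \qquad (2.36)$$

The triangle inequality and the uniform bound $\|\theta_2^{(k)}\|_2 \le B_{\theta_2}$, for all $k \ge 1$, established in (2.9), together imply that

$$\|C(X^{(k)}) - d\|_\beta \le \|s^{(k)}\|_\beta + \|\lambda^{(k)}\theta_2^{(k)}\|_\beta + J(\beta^*)\left(\sqrt{2\epsilon^{(k)}}\,\sigma_{\max}(M) + J(\beta^*)\mu_2\lambda^{(k)}\right)$$
$$\le \|s^{(k)}\|_\beta + J(\beta^*)\left(B_{\theta_2} + \mu_2 J(\beta^*)\right)\lambda^{(k)} + \left(\sqrt{2}\,\sigma_{\max}(M)\, J(\beta^*)\right)\sqrt{\epsilon^{(k)}}. \qquad (2.37)$$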

Thus, ((05t and ((237| together imply that 

/ii||a(xW)|U + M2||C(XW) - rf||^ < ^i||a(X,)|l„ + fi2\\C{X,) - dh 

(Bl + Bl ^ ^^j^^^^ ^^^^ ^ ^^ j^\ ^(,) 



-|- max 






+ fi2 (V2 a„„,(Af) J(/3*)) V^. (2.38) 

Since {(X^*^), s('=),y('=))}fegz+ is a bounded sequence, r^*^) = ^le^*^) and ^('^^ = K2e^''^ for aU A: > 1, (1^51) 
implies one side of the bound in ([n]) . 
For all $k \ge 1$,

$$\|A(X^{(k)}) + y^{(k)} - b\|_2 \le \|A(X^{(k)}) + y^{(k)} - b - \lambda^{(k)}\theta_1^{(k)}\|_2 + \lambda^{(k)}\|\theta_1^{(k)}\|_2$$
$$= \lambda^{(k)}\|\theta_1^{(k+1)}\|_2 + \lambda^{(k)}\|\theta_1^{(k)}\|_2$$
$$\le 2B_{\theta_1}\lambda^{(k)},$$

where the last inequality follows from the fact that $\|\theta_1^{(k)}\|_2 \le B_{\theta_1}$ for all $k \ge 1$; see (2.9) for details. This
establishes (i).
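The equality in the middle of the chain above is the multiplier update read backwards. A short derivation follows, assuming the update has the form $\theta_1^{(k+1)} = \theta_1^{(k)} - (A(X^{(k)}) + y^{(k)} - b)/\lambda^{(k)}$, which is how we read the last lines of Figure 3.1:

```latex
\begin{align*}
\theta_1^{(k+1)} &= \theta_1^{(k)} - \frac{A(X^{(k)}) + y^{(k)} - b}{\lambda^{(k)}} \\
\Longrightarrow\quad
A(X^{(k)}) + y^{(k)} - b - \lambda^{(k)}\theta_1^{(k)} &= -\lambda^{(k)}\theta_1^{(k+1)}, \\
\|A(X^{(k)}) + y^{(k)} - b\|_2
  &\le \lambda^{(k)}\bigl(\|\theta_1^{(k+1)}\|_2 + \|\theta_1^{(k)}\|_2\bigr)
  \le 2 B_{\theta_1}\lambda^{(k)} .
\end{align*}
```

Under that assumption the feasibility constant in part (i) of Theorem 2.4 can be taken as $c_1 = 2B_{\theta_1}$.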

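Next, we establish a lower bound for $P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)})$ using the following pair of Lagrangian duals

$$\min_{X \in \mathbb{R}^{m \times n}} \ \mu_1\|\sigma(X)\|_\alpha + \mu_2\|C(X) - d\|_\beta \quad \text{s.t.} \quad \|A(X) - b\|_\gamma \le \rho, \qquad (2.39a)$$

$$\max_{w \in \mathbb{R}^q,\, v \in \mathbb{R}^p} \ b^T w + d^T v - \rho\|w\|_{\gamma^*} \quad \text{s.t.} \quad \|\sigma(A^*(w) + C^*(v))\|_{\alpha^*} \le \mu_1, \ \ \|v\|_{\beta^*} \le \mu_2. \qquad (2.39b)$$

Let $(w_*, v_*)$ denote the optimal solution of the dual (2.39b). Let $f(X, s, y) = \tfrac{1}{2}\|A(X) + y - b - \lambda\theta_1\|_2^2 + \tfrac{1}{2}\|C(X) + s - d - \lambda\theta_2\|_2^2$. Moreover, (2.40a) and (2.40b) below are also a Lagrange primal-dual pair of problems.

$$\min_{X \in \mathbb{R}^{m \times n},\, s \in \mathbb{R}^p,\, y \in \mathbb{R}^q} \ \lambda\left(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|s\|_\beta\right) + f(X, s, y) \quad \text{s.t.} \quad \|y\|_\gamma \le \rho, \qquad (2.40a)$$

$$\max_{w \in \mathbb{R}^q,\, v \in \mathbb{R}^p} \ \lambda(b + \lambda\theta_1)^T w + \lambda(d + \lambda\theta_2)^T v - \frac{\lambda^2}{2}\left(\|w\|_2^2 + \|v\|_2^2\right) - \lambda\rho\|w\|_{\gamma^*} \quad \text{s.t.} \quad \|\sigma(A^*(w) + C^*(v))\|_{\alpha^*} \le \mu_1, \ \ \|v\|_{\beta^*} \le \mu_2. \qquad (2.40b)$$

Since $(w_*, v_*)$ is feasible for (2.40b), it follows that

$$P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) = \min\left\{\lambda^{(k)}\left(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|s\|_\beta\right) + f^{(k)}(X, s, y) : \|y\|_\gamma \le \rho\right\}$$
$$\ge \lambda^{(k)}\left(b^T w_* + d^T v_* - \rho\|w_*\|_{\gamma^*} - \frac{\lambda^{(k)}}{2}\left(\|w_*\|_2^2 + \|v_*\|_2^2 - 2(\theta_1^{(k)})^T w_* - 2(\theta_2^{(k)})^T v_*\right)\right), \qquad (2.41)$$
$$\ge \lambda^{(k)}\left(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|C(X_*) - d\|_\beta\right) - \frac{(\lambda^{(k)})^2}{2}\left(\|w_*\|_2^2 + \|v_*\|_2^2 + 2\|\theta_1^{(k)}\|_2\|w_*\|_2 + 2\|\theta_2^{(k)}\|_2\|v_*\|_2\right), \qquad (2.42)$$

where (2.41) follows from weak duality for the primal-dual pair in (2.40), and (2.42) follows from strong duality
for the primal-dual pair in (2.39) and the Cauchy-Schwarz inequality.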
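Since the FALC iterates $\{X^{(k)}\}_{k \in \mathbb{Z}_+}$ satisfy

$$\frac{P^{(k)}(X^{(k)}, s^{(k)}, y^{(k)})}{\lambda^{(k)}} = \left(\mu_1\|\sigma(X^{(k)})\|_\alpha + \mu_2\|s^{(k)}\|_\beta\right) + \frac{\lambda^{(k)}}{2}\left(\|\theta_1^{(k+1)}\|_2^2 + \|\theta_2^{(k+1)}\|_2^2\right),$$

it follows that

$$\mu_1\|\sigma(X^{(k)})\|_\alpha + \mu_2\|s^{(k)}\|_\beta \ge \frac{P^{(k)}(X_*^{(k)}, s_*^{(k)}, y_*^{(k)})}{\lambda^{(k)}} - \frac{\lambda^{(k)}}{2}\left(\|\theta_1^{(k+1)}\|_2^2 + \|\theta_2^{(k+1)}\|_2^2\right). \qquad (2.43)$$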
Thus, the bound on $\|\theta_i^{(k)}\|_2$, $i \in \{1, 2\}$, established in (2.9), and the inequalities (2.42) and (2.43), together
imply that

$$\mu_1\|\sigma(X^{(k)})\|_\alpha + \mu_2\|s^{(k)}\|_\beta \ge \mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|C(X_*) - d\|_\beta - \frac{\lambda^{(k)}}{2}\left(\left(B_{\theta_1} + \|w_*\|_2\right)^2 + \left(B_{\theta_2} + \|v_*\|_2\right)^2\right). \qquad (2.44)$$

The bound $\|A^*(w_*) + C^*(v_*)\|_F \le I(\alpha^*)\|\sigma(A^*(w_*) + C^*(v_*))\|_{\alpha^*} \le I(\alpha^*)\mu_1$ implies that

$$\sigma_{\min}(A)\|w_*\|_2 \le \|A^*(w_*)\|_F \le I(\alpha^*)\mu_1 + \|C^*(v_*)\|_F \le I(\alpha^*)\mu_1 + \sigma_{\max}(C)\|v_*\|_2,$$

and the bound $\|v_*\|_{\beta^*} \le \mu_2$ implies that $\|v_*\|_2 \le J(\beta^*)\mu_2$.

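From (2.36), the triangle inequality, and the uniform bound $\|\theta_2^{(k)}\|_2 \le B_{\theta_2}$, for all $k \ge 1$, established in
(2.9), it follows that

$$\|s^{(k)}\|_\beta \le \|C(X^{(k)}) - d\|_\beta + J(\beta^*)\left(B_{\theta_2} + \mu_2 J(\beta^*)\right)\lambda^{(k)} + \left(\sqrt{2}\,\sigma_{\max}(M)\,J(\beta^*)\right)\sqrt{\epsilon^{(k)}}. \qquad (2.45)$$

From (2.44) and (2.45), it follows that

$$\mu_1\|\sigma(X^{(k)})\|_\alpha + \mu_2\|C(X^{(k)}) - d\|_\beta \ge \mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|C(X_*) - d\|_\beta$$
$$- \left(\tfrac{1}{2}\left(B_{\theta_1} + \|w_*\|_2\right)^2 + \tfrac{1}{2}\left(B_{\theta_2} + \|v_*\|_2\right)^2 + \mu_2 J(\beta^*)\left(B_{\theta_2} + \mu_2 J(\beta^*)\right)\right)\lambda^{(k)} - \mu_2\left(\sqrt{2}\,\sigma_{\max}(M)\,J(\beta^*)\right)\sqrt{\epsilon^{(k)}}.$$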

This establishes the result. D 

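Theorem 2.5. Fix $0 < \nu < 1$, and strictly positive parameters $(\lambda^{(1)}, \epsilon^{(1)}, \tau^{(1)}, \xi^{(1)})$. Then there exists
a sequence of parameters $\{(\lambda^{(k)}, \epsilon^{(k)}, \tau^{(k)}, \xi^{(k)})\}_{k \in \mathbb{Z}_+}$ such that for all $\epsilon > 0$, Algorithm FALC displayed in
Figure 2.1 computes an $\epsilon$-feasible and $\epsilon$-optimal solution $X \in \mathbb{R}^{m \times n}$ to problem (1.1), i.e. for some $y \in \mathbb{R}^q$
such that $\|y\|_\gamma \le \rho$, we have

$$\|A(X) + y - b\|_2 \le \epsilon,$$

$$\left|\left(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|C(X) - d\|_\beta\right) - \left(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|C(X_*) - d\|_\beta\right)\right| \le \epsilon,$$

in $O(1/\epsilon)$ operations.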

Proof. Fix $\lambda^{(1)} > 0$, $\epsilon^{(1)} > 0$ and choose $0 < \nu < 1$, and update the parameters as follows: for all $k \ge 1$,



g(fc+l) ^ j,2 g(fc) ^(fe) ^ 



1 Ak) 



i(Bx+p) 



f{k) 



(2.46) 



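where $\max\{\|X_*\|_F, \|X^{(k)}\|_F\} \le B_X$ for all $k \ge 1$. For this specific choice of the $\{(\lambda^{(k)}, \epsilon^{(k)}, \tau^{(k)}, \xi^{(k)})\}_{k \in \mathbb{Z}_+}$ sequence, we have $\epsilon^{(k)}/(\lambda^{(k)})^2 = \epsilon^{(1)}/(\lambda^{(1)})^2$ for all $k \ge 1$. Hence, setting $B = \epsilon^{(1)}/(\lambda^{(1)})^2$, Theorem 2.4 guarantees there exist $c_2 > 0$ and $c_3 > 0$ such that for all $k \ge 1$,

$$\left|\left(\mu_1\|\sigma(X^{(k)})\|_\alpha + \mu_2\|C(X^{(k)}) - d\|_\beta\right) - \left(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|C(X_*) - d\|_\beta\right)\right| \le \left(c_2\lambda^{(1)} + c_3\sqrt{\epsilon^{(1)}}\right)\nu^{k-1}. \qquad (2.47)$$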



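Thus,

$$\left|\left(\mu_1\|\sigma(X^{(k)})\|_\alpha + \mu_2\|C(X^{(k)}) - d\|_\beta\right) - \left(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|C(X_*) - d\|_\beta\right)\right| \le \epsilon$$

for all $k \in \mathbb{Z}_+$ such that

$$k \ge \frac{\ln\left(\left(c_2\lambda^{(1)} + c_3\sqrt{\epsilon^{(1)}}\right)/\epsilon\right)}{\ln(1/\nu)} + 1. \qquad (2.48)$$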
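Moreover, Theorem 2.4 also implies that there exists $c_1 > 0$ such that for $k \ge 1$,

$$\|A(X^{(k)}) + y^{(k)} - b\|_2 \le c_1\lambda^{(k)} = c_1\lambda^{(1)}\nu^{k-1}.$$

Thus, $\|A(X^{(k)}) + y^{(k)} - b\|_2 \le \epsilon$ for all $k \in \mathbb{Z}_+$ such that

$$k \ge \frac{\ln\left(c_1\lambda^{(1)}/\epsilon\right)}{\ln(1/\nu)} + 1. \qquad (2.49)$$

Let $N_{FALC}$ denote the number of FALC iterations required to compute an $\epsilon$-feasible and $\epsilon$-optimal solution.
Let

$$C = \max\left\{c_2\lambda^{(1)} + c_3\sqrt{\epsilon^{(1)}},\ c_1\lambda^{(1)}\right\}.$$

Then (2.48) and (2.49) imply that for all $\epsilon > 0$,

$$N_{FALC} \le \frac{\ln(C/\epsilon)}{\ln(1/\nu)} + 1. \qquad (2.50)$$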
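As a quick numerical illustration of the bound (2.50), the sketch below evaluates $\ln(C/\epsilon)/\ln(1/\nu) + 1$ for made-up values of $C$, $\epsilon$ and $\nu$ (these numbers are assumptions for illustration, not values from the paper) and checks that the geometric envelope $C\nu^{k-1}$ has dropped below $\epsilon$ at that iteration count.

```python
import math

def falc_iteration_bound(C, eps, nu):
    """Right-hand side of (2.50): ln(C/eps) / ln(1/nu) + 1."""
    return math.log(C / eps) / math.log(1.0 / nu) + 1.0

# Hypothetical constants chosen only for illustration.
C, eps, nu = 1.0, 1e-3, 0.5
k = math.ceil(falc_iteration_bound(C, eps, nu))

# The envelope C * nu**(k-1) from (2.47) is below eps at iteration k,
# and still above eps one iteration earlier.
below = C * nu ** (k - 1) <= eps
above_before = C * nu ** (k - 2) > eps
```

The iteration count grows only logarithmically in $1/\epsilon$; the $O(1/\epsilon)$ operation count of Theorem 2.5 comes from the increasing cost of the inner solves, not from the number of outer iterations.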

and from Lemma 12.31 it follows that iVpALC FALC iterations require at most 



/A^FALC • / „^2 2 



niiii{nm^, n^m} 



^eC^) 



operations. Since 



it follows that 



WfALC I—. -. AfpALC ,,-ArFALi 

^-^ V eC') ^/7TT ^^ (1 - i^ 



(l-i.) v^' 



^ ^ (' 'T^W"™':'-^'"^} . , -WfalC 



"^ - ' /^ 



I' 



From (|2.50p it follows that for all e > an e-feasible and e-optimal solution can be computed in at most 






operations. D 

3. Implementation Details of Algorithm FALC. In this section we describe all the details of
FALC. The implementable version of FALC for solving problem (1.1) with $\|\sigma(\cdot)\|_\alpha$ and $\|\cdot\|_\beta$ denoting the
nuclear and $\ell_1$ norms, respectively, is shown in Figure 3.1.

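3.1. Bounds on the iterates $\{X^{(k)}\}_{k \in \mathbb{Z}_+}$. Let $X^{(0)} = \operatorname{argmin}_{X \in \mathbb{R}^{m \times n}}\{\|X\|_F : A(X) = b\}$, $s^{(0)} = d - C(X^{(0)})$ and $y^{(0)} = 0$, where $A$ is a surjective linear map. Computing $X^{(0)}$ requires a projection onto
the affine space $\{X \in \mathbb{R}^{m \times n} : A(X) = b\}$. Let $(X_*^{(k)}, s_*^{(k)}, y_*^{(k)}) = \operatorname{argmin}\{P^{(k)}(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \le \rho\}$. Since $A(X^{(0)}) = b$, $s^{(0)} = d - C(X^{(0)})$ and $f^{(k)}(X, s, y) \ge 0$ for all $X \in \mathbb{R}^{m \times n}$, $s \in \mathbb{R}^p$ and
$y \in \mathbb{R}^q$, it follows that

$$\mu_1\|\sigma(X_*^{(k)})\|_\alpha + \mu_2\|s_*^{(k)}\|_\beta \le \mu_1\|\sigma(X^{(0)})\|_\alpha + \mu_2\|s^{(0)}\|_\beta + \frac{\lambda^{(k)}}{2}\left(\|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2\right). \qquad (3.1)$$

Let $\eta := \mu_1\|\sigma(X^{(0)})\|_\alpha + \mu_2\|s^{(0)}\|_\beta$ and for each $k \ge 1$, $\eta^{(k)} := \eta + \frac{\lambda^{(k)}}{2}\left(\|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2\right)$. We inexactly
minimize $P^{(k)}$ over the set

$$\left\{X \in \mathbb{R}^{m \times n},\ s \in \mathbb{R}^p,\ y \in \mathbb{R}^q : \mu_1\|\sigma(X)\|_\alpha \le \eta^{(k)},\ \|y\|_\gamma \le \rho\right\}, \qquad (3.2)$$

to ensure that the $\{X^{(k)}\}_{k \in \mathbb{Z}_+}$ sequence always remains bounded, which also implies that the $\{s^{(k)}\}_{k \in \mathbb{Z}_+}$ sequence
remains bounded (see (2.45)).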
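The starting point $X^{(0)}$ is the minimum Frobenius norm solution of $A(X) = b$. A minimal NumPy sketch follows, assuming the operator is given as a dense matrix with full row rank acting on the column-stacked $\mathrm{vec}(X)$; the function name is illustrative, and for large problems one would use the structure of $A$ instead of forming $AA^T$.

```python
import numpy as np

def min_frobenius_feasible(A, b, shape):
    """X0 = argmin ||X||_F subject to A vec(X) = b, for A with full row rank.

    Uses the closed form vec(X0) = A^T (A A^T)^{-1} b.
    """
    x0 = A.T @ np.linalg.solve(A @ A.T, b)
    # vec() stacks columns, so reshape in column-major (Fortran) order.
    return x0.reshape(shape, order="F")
```

Any other feasible point differs from $X^{(0)}$ by an element of the null space of $A$, which is orthogonal to $X^{(0)}$ and can therefore only increase the norm.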
First-Order Augmented Lagrangian Algorithm ($\{(\lambda^{(k)}, \epsilon^{(k)}, \tau^{(k)}, \xi^{(k)})\}_{k \in \mathbb{Z}_+}$)

L^a^„^(M), X<-°^ ^'Argmm{\\X\\p\A{X)=b}, s^ ^ rf - C(X(o)), yW ^ 0, t(") ^ oo, fc^O 
while (fc > 0) 
do 

A:^ fc + 1 

INITIALIZEO 

^^0, Si^O, Sa^O, Sg^O, i9(") ^ 1 

repeat 



^3 '-^a '^3 



y ^y , V./"°)(xf"^'4'°-^',y<'-^') 
^2 <- ^2 H ^5l7] 

^3 ^ ^3 ^ ^(i) 

[U,D,V] = swd(X('=^o) - fi), 



d = dza9(i?), d„ = n,, (d, T^Sw' IIT ) ' '^+ = (^ - T™^ 



1 Xf ''+^) ^ Udiag(d^)V^, If ■'+^) ^ Udiag{d+)V^ 



3 yf'+'Vargmin{||y-(s('=>")-fa)||2: ||y||^ < p} 

6 g ^ argmin{||t;||2 : v ^ A^^^^P + V,/W(xf ■^+^\ 4'^'^+^\ yf '^+i^p G a|l.|U I <.,.+i,} 



^(^+1) ^ V(^('))^-4(tfW)^-(^'^')^ 



T 



►7 ^ llw j-r*-W T^(fe-^+l) (fc,^+l) {k,t+l)\ u , „ jy(k\ I v(k,t+l) (k,l+l) (kj+l)\ (k.i+1) 



if ||xf '^+1) - Xf '^'IIp < g and pf^^+i^ - 4'=^^)||2 < g 
then 

(Aso;,Sso;j <— (,^1 ,Si j 

return (Xso;,Sso/) 
if {£ == 0) 
then 

(fc) • r - (fc-1) ii^ii 1 (fc) • r - (fe-1) II II 1 

Tjj-' <- mini CrTJf ', c^||G||f}, t^ ^ min{ c^r; % c^||5||2| 

^(fc) ^_ min{ cj^^'^~"^\ c^(l)} 

e^e + 1 

until (£ > iVW) or f ||G||f < rf ^ and II5II2 < rf ^ and </) < ^C^) 

if (||G||f < rf ^ and ||g||2 < rf ^ and </) < ^^'^^ 
then 

(xW,.W,yW)^(x("\4'^'^\yf-^)) 
else 

/)(*=+!) , nik) ^(X(''))+jy"''-fc 

/)(*=+!) , /i(fc) _ C(X(''))+s"=)-rf 
''2 ^ "^2 \(k) 

Fig. 3.1. Details of the First-Order Augmented Lagrangian Algorithm (FALC) 
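INITIALIZE

$f^{(k)}(X, s, y) = \tfrac{1}{2}\|A(X) + y - b - \lambda^{(k)}\theta_1^{(k)}\|_2^2 + \tfrac{1}{2}\|C(X) + s - d - \lambda^{(k)}\theta_2^{(k)}\|_2^2$
$\eta^{(k)} \leftarrow \mu_1\|X^{(0)}\|_* + \mu_2\|s^{(0)}\|_1 + \frac{\lambda^{(k)}}{2}\left(\|\theta_1^{(k)}\|_2^2 + \|\theta_2^{(k)}\|_2^2\right)$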



,(fc,0) ^ ^(fc_i)^ ^(fc,0) ^ ^(fe_i)^ ^(fc,0) ^ ^(fc_i) 



y^VyC^-i), yfo)^^^^-!), yf^VyC^-i) 



Fig. 3.2. Details of INITIALIZE subroutine 



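3.2. Subgradient selection. In order to check the stopping condition (2.3)(b), in line 6 and line 7 of
Figure 3.1 we compute a subgradient $(G, g) \in \partial_{X,s} P^{(k)}(\cdot, \cdot, \cdot)|_{(X_1^{(k,\ell+1)}, s_1^{(k,\ell+1)}, y_1^{(k,\ell+1)})}$ such that $G = \lambda^{(k)}\mu_1 Q + \nabla_X f^{(k)}(X_1^{(k,\ell+1)}, s_1^{(k,\ell+1)}, y_1^{(k,\ell+1)})$ and $g = \lambda^{(k)}\mu_2 q + \nabla_s f^{(k)}(X_1^{(k,\ell+1)}, s_1^{(k,\ell+1)}, y_1^{(k,\ell+1)})$, where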



q = argmin 






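where $\bar{X}_2^{(k,\ell+1)}$ is defined in (3.6). It can be easily shown that $Q \in \partial\|\sigma(\cdot)\|_\alpha|_{X_1^{(k,\ell+1)}}$, and, given $\nabla_s f^{(k)}$ at
$(X_1^{(k,\ell+1)}, s_1^{(k,\ell+1)}, y_1^{(k,\ell+1)})$, the complexity of computing $q \in \partial\|\cdot\|_\beta|_{s_1^{(k,\ell+1)}} \subset \mathbb{R}^p$ is $O(p)$ when $\beta \in \{1, 2\}$ and
$O(p\log(p))$ when $\beta = \infty$.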
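For $\beta = 1$ the minimum-norm choice of $g$ has a closed form that costs $O(p)$. The sketch below is an illustration under that assumption (the function and argument names are ours, with `weight` standing for $\lambda^{(k)}\mu_2$): entries with $s_i \ne 0$ have a unique subgradient, and entries with $s_i = 0$ are shrunk toward zero.

```python
import numpy as np

def min_norm_subgradient_l1(s, grad_s, weight):
    """Minimum 2-norm element of  weight * d||.||_1(s) + grad_s,  weight >= 0."""
    # Where s_i != 0 the subdifferential of |.| is the single value sign(s_i).
    g = grad_s + weight * np.sign(s)
    # Where s_i == 0 any p_i in [-1, 1] is allowed, so pick the p_i that
    # brings weight * p_i + grad_s_i as close to zero as possible.
    zero = (s == 0)
    g[zero] = np.sign(grad_s[zero]) * np.maximum(np.abs(grad_s[zero]) - weight, 0.0)
    return g
```

If the returned vector is zero, the current $s$ is optimal in the $s$-block for the given gradient, which is the quantity the inner stopping test monitors.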

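3.3. Details of inner iterations. We now discuss the details of Algorithm 3 [33] for solving

$$\min_{X \in \mathbb{R}^{m \times n},\, s \in \mathbb{R}^p,\, \|y\|_\gamma \le \rho} \ \lambda\left(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|s\|_\beta\right) + f(X, s, y), \qquad (3.3)$$

where $f$ is a proper, lower semicontinuous (lsc), convex function and has a Lipschitz continuous gradient $\nabla f$
with constant $L$ with respect to the norm $\|\cdot\|$ on $\mathbb{R}^{m \times n} \times \mathbb{R}^p \times \mathbb{R}^q$ such that for any $X \in \mathbb{R}^{m \times n}$, $s \in \mathbb{R}^p$ and
$y \in \mathbb{R}^q$,

$$\|(X, s, y)\| = \sqrt{\|X\|_F^2 + \|s\|_2^2 + \|y\|_2^2}. \qquad (3.4)$$

Let $(X_*, s_*, y_*) = \operatorname{argmin}_{X \in \mathbb{R}^{m \times n},\, s \in \mathbb{R}^p,\, \|y\|_\gamma \le \rho} \lambda(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|s\|_\beta) + f(X, s, y)$ and let $\eta$ denote any upper
bound on $\|\sigma(X_*)\|_\alpha$ for the optimal solution $X_*$. Algorithm 3 computes three sets of iterates for
all the variables, $(X_1^{(\ell)}, X_2^{(\ell)}, X_3^{(\ell)})$, $(s_1^{(\ell)}, s_2^{(\ell)}, s_3^{(\ell)})$ and $(y_1^{(\ell)}, y_2^{(\ell)}, y_3^{(\ell)})$:

1. $(X_3^{(\ell)}, s_3^{(\ell)}, y_3^{(\ell)})$ is a convex combination of $(X_1^{(\ell)}, s_1^{(\ell)}, y_1^{(\ell)})$ and $(X_2^{(\ell)}, s_2^{(\ell)}, y_2^{(\ell)})$:

$X_3^{(\ell)} = (1 - \vartheta^{(\ell)})X_1^{(\ell)} + \vartheta^{(\ell)}X_2^{(\ell)}$,
$s_3^{(\ell)} = (1 - \vartheta^{(\ell)})s_1^{(\ell)} + \vartheta^{(\ell)}s_2^{(\ell)}$,
$y_3^{(\ell)} = (1 - \vartheta^{(\ell)})y_1^{(\ell)} + \vartheta^{(\ell)}y_2^{(\ell)}$.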
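2. $(X_2^{(\ell+1)}, s_2^{(\ell+1)}, y_2^{(\ell+1)})$ is computed using the gradients $\nabla f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)})$ for all the iterates $i \le \ell$.

$$X_2^{(\ell+1)} = \operatorname{argmin}_{X : \|\sigma(X)\|_\alpha \le \eta} \left\{ \sum_{i=0}^{\ell} \frac{1}{\vartheta^{(i)}} \left[ \left\langle \nabla_X f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)}),\, X \right\rangle + \lambda\mu_1\|\sigma(X)\|_\alpha \right] + \frac{L}{2}\|X - X^{(0)}\|_F^2 \right\}$$
$$= \operatorname{argmin}_{X : \|\sigma(X)\|_\alpha \le \eta} \left\{ \frac{1}{2}\left\| X - \left( X^{(0)} - \frac{1}{L}\sum_{i=0}^{\ell} \frac{\nabla_X f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)})}{\vartheta^{(i)}} \right) \right\|_F^2 + \frac{\lambda\mu_1}{L}\left( \sum_{i=0}^{\ell} \frac{1}{\vartheta^{(i)}} \right)\|\sigma(X)\|_\alpha \right\}. \qquad (3.5)$$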



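$\bar{X}_2$ is the unconstrained solution of the problem (3.5), i.e.

$$\bar{X}_2^{(\ell+1)} = \operatorname{argmin}_{X} \left\{ \frac{1}{2}\left\| X - \left( X^{(0)} - \frac{1}{L}\sum_{i=0}^{\ell} \frac{\nabla_X f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)})}{\vartheta^{(i)}} \right) \right\|_F^2 + \frac{\lambda\mu_1}{L}\left( \sum_{i=0}^{\ell} \frac{1}{\vartheta^{(i)}} \right)\|\sigma(X)\|_\alpha \right\}, \qquad (3.6)$$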


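and $\bar{X}_2$ is used for computing subgradients as explained in Section 3.2.
Similarly,

$$s_2^{(\ell+1)} = \operatorname{argmin}_{s} \left\{ \sum_{i=0}^{\ell} \frac{1}{\vartheta^{(i)}} \left[ \nabla_s f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)})^T s + \lambda\mu_2\|s\|_\beta \right] + \frac{L}{2}\|s - s^{(0)}\|_2^2 \right\}$$
$$= \operatorname{argmin}_{s} \left\{ \frac{1}{2}\left\| s - \left( s^{(0)} - \frac{1}{L}\sum_{i=0}^{\ell} \frac{\nabla_s f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)})}{\vartheta^{(i)}} \right) \right\|_2^2 + \frac{\lambda\mu_2}{L}\left( \sum_{i=0}^{\ell} \frac{1}{\vartheta^{(i)}} \right)\|s\|_\beta \right\}, \qquad (3.7)$$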



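and

$$y_2^{(\ell+1)} = \operatorname{argmin}_{y : \|y\|_\gamma \le \rho} \left\{ \sum_{i=0}^{\ell} \frac{1}{\vartheta^{(i)}} \nabla_y f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)})^T y + \frac{L}{2}\|y - y^{(0)}\|_2^2 \right\}$$
$$= \operatorname{argmin}_{y : \|y\|_\gamma \le \rho} \left\| y - \left( y^{(0)} - \frac{1}{L}\sum_{i=0}^{\ell} \frac{\nabla_y f(X_3^{(i)}, s_3^{(i)}, y_3^{(i)})}{\vartheta^{(i)}} \right) \right\|_2^2. \qquad (3.8)$$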



Thus, in Step 1 and Step 2 of Figure 3.1 we compute the $X_2$ and $s_2$ iterates, respectively.



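3. $(X_1^{(\ell+1)}, s_1^{(\ell+1)}, y_1^{(\ell+1)})$ is a convex combination of $(X_1^{(\ell)}, s_1^{(\ell)}, y_1^{(\ell)})$ and $(X_2^{(\ell+1)}, s_2^{(\ell+1)}, y_2^{(\ell+1)})$:

$X_1^{(\ell+1)} = (1 - \vartheta^{(\ell)})X_1^{(\ell)} + \vartheta^{(\ell)}X_2^{(\ell+1)}$,
$s_1^{(\ell+1)} = (1 - \vartheta^{(\ell)})s_1^{(\ell)} + \vartheta^{(\ell)}s_2^{(\ell+1)}$,
$y_1^{(\ell+1)} = (1 - \vartheta^{(\ell)})y_1^{(\ell)} + \vartheta^{(\ell)}y_2^{(\ell+1)}$.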
In order to solve (3.5) and (3.6) we need to compute the SVD of an appropriately defined matrix. The
iteration description above implicitly assumed that we need to compute this SVD exactly. This is not
necessary: inexactly computing the SVD adds a small additional error term.
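The three-sequence scheme above can be sketched on a much smaller problem. The code below is an illustration only: it applies the same pattern (averaged gradients, a shrinkage step anchored at the starting point, and two convex combinations) to the vector problem $\min_s \lambda\mu\|s\|_1 + \tfrac{1}{2}\|Ms - b\|_2^2$, with no matrix variable and no constraint on $y$. The function names and the $\vartheta$ update written in the code are our reading of the scheme, not a transcription of Figure 3.1.

```python
import numpy as np

def soft(v, t):
    """Entrywise soft-thresholding: prox of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def three_sequence_l1(M, b, lam_mu, iters=500):
    """Sketch: min_s lam_mu * ||s||_1 + 0.5 * ||M s - b||_2^2."""
    L = np.linalg.norm(M, 2) ** 2          # Lipschitz constant of the gradient
    s0 = np.zeros(M.shape[1])
    s1, s2 = s0.copy(), s0.copy()
    grad_sum = np.zeros_like(s0)           # sum_i grad f(s3^(i)) / theta^(i)
    weight = 0.0                           # sum_i 1 / theta^(i)
    theta = 1.0
    for _ in range(iters):
        s3 = (1.0 - theta) * s1 + theta * s2          # step 1
        grad_sum += M.T @ (M @ s3 - b) / theta
        weight += 1.0 / theta
        s2 = soft(s0 - grad_sum / L, lam_mu * weight / L)  # step 2, cf. (3.7)
        s1 = (1.0 - theta) * s1 + theta * s2          # step 3
        theta = (np.sqrt(theta**4 + 4.0 * theta**2) - theta**2) / 2.0
    return s1
```

When $M$ is the identity the minimizer is `soft(b, lam_mu)`, which gives a simple sanity check of the sketch.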

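3.4. Stopping criterion for FALC. In our numerical experiments, we terminate either when the distance
between successive inner iterates is below a threshold $\varrho$ for each component, i.e. $\|X_1^{(k,\ell)} - X_1^{(k,\ell-1)}\|_F \le \varrho$ and
$\|s_1^{(k,\ell)} - s_1^{(k,\ell-1)}\|_2 \le \varrho$, or when there exist partial subgradients with sufficiently small norm for each component,
i.e. $\|G\|_F \le \varsigma_X$, $\|g\|_2 \le \varsigma_s$ for some $(G, g) \in \partial_{X,s} P^{(k)}(\cdot, \cdot, \cdot)|_{(X_1^{(k,\ell)}, s_1^{(k,\ell)}, y_1^{(k,\ell)})}$ and

$$\rho\,\|\nabla_y P^{(k)}(X_1^{(k,\ell)}, s_1^{(k,\ell)}, y_1^{(k,\ell)})\|_{\gamma^*} + \nabla_y P^{(k)}(X_1^{(k,\ell)}, s_1^{(k,\ell)}, y_1^{(k,\ell)})^T y_1^{(k,\ell)} \le \varsigma_y.$$

In our numerical experiments we set $\varrho$, $\varsigma_X$, $\varsigma_s$ and $\varsigma_y$ by experimenting with a small instance of the problem.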
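The quantity tested against $\varsigma_y$ equals $\max\{\nabla_y P^T(y - z) : \|z\|_\gamma \le \rho\}$, so it is nonnegative and vanishes exactly when $y$ is optimal over the ball for the linearized objective. A small sketch of how it could be evaluated follows; the helper name and the dual-norm lookup table are ours.

```python
import numpy as np

# Dual (Hoelder conjugate) exponent for each supported gamma.
_DUAL = {1: np.inf, 2: 2, np.inf: 1}

def y_optimality_gap(grad_y, y, rho, gamma):
    """rho * ||grad_y||_{gamma*} + grad_y^T y for gamma in {1, 2, inf}."""
    return rho * np.linalg.norm(grad_y, _DUAL[gamma]) + grad_y @ y
```

For $\gamma = 2$ the gap is zero at $y = -\rho\,\nabla_y P / \|\nabla_y P\|_2$, the point of the ball that is most aligned with the negative gradient.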
3.5. Multiplier selection. Given Cr S (0, 1), cj S (0, 1), cx > 0, Ct- € (0, 1), cj e (0, 1), cx G (0, 1), for 
all A; > 1 the approximate optimality parameters t''^\ ^^^^ and the penalty parameter A*^*^^ are set as follows: 



4 
Ad 

A(fc 



= argmin^,K"-x„||Z-(x(0)-J-Vx/W(^(°',^(°',2/(°^))lll + 4Jf^lk(^)IU, 

= argmin,,«, Hz - (s^ - J-V./W(X(o), .("), y^)) \\l + ^||.|k 

= argmin„^ll„ll^<J|«- (y(0) - J-V,/«(X("), .("), yW)) ||i, ' 

- argmin{||«||2 If = A(i)/^2P + V./(i)(Z(i),zW, «(!)), p e 9||.||^|,a,}, 

- caIIX^IU, 

= argmin^,K„><„||Z-(x('=-i)-^Vx/('^H^('-'\s^'=-'\2/^'^-'0)lll + ^lk(^)IU, 

= argmin,,«, ||z - (.sC^-D - ^V./WlxC^-D, .('=-i),y('=-i))) Hi + ^\\z\\p, 

= argmin„^„„ll^<^ ||« - (y^^-D - ^V./WCXC^-D, s^^-D, yC^-D)) ||i, 

= argmin{||«||2:f''-A(^V2P + VjW(zW,zW,z;W), p G a||.||^|,(.,}, 

= min{c.4'-^\c.||GW||f }, 

= min{c.ri'=-^\c.||5('=)||2}, 

= minjc^ef^-i), (p||Vj,/W(Z«,z(^-),w(^-))||^. +Vj,/('=)(Z«,zW,vW)^w(^-))} 

= caA(^-i), 



(3.9) 



for all k > 2. In all our experiments, c^ = 0.999 and c^ = 0.999. 

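We initialize FALC with $(X^{(0)}, s^{(0)})$ such that $A(X^{(0)}) = b$ and $s^{(0)} = d - C(X^{(0)})$. In the first iteration of
FALC, we solve the problem

$$\min_{\|\sigma(X)\|_\alpha \le \eta^{(1)}} P^{(1)}(X, s) = \min_{\|\sigma(X)\|_\alpha \le \eta^{(1)}} \lambda^{(1)}\left(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|s\|_\beta\right) + f^{(1)}(X, s),$$

where $\eta^{(1)} = \mu_1\|\sigma(X^{(0)})\|_\alpha + \mu_2\|s^{(0)}\|_\beta$. Since $X^{(0)}$ is feasible, $f^{(1)}(X^{(0)}, s^{(0)}) = 0$ and $P^{(1)}(X^{(0)}, s^{(0)}) = \lambda^{(1)}\eta^{(1)}$, and $P^{(1)}(X, s) \ge 0$ for all $X \in \mathbb{R}^{m \times n}$, the initial duality gap is less than or equal to $\lambda^{(1)}\eta^{(1)}$. Hence,
we initialize $\epsilon^{(1)} = 0.99\,\lambda^{(1)}\eta^{(1)}$ and then set $\epsilon^{(k+1)} = c_\epsilon\,\epsilon^{(k)}$ for all $k \ge 1$.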

4. Numerical experiments. We conducted two sets of numerical experiments with FALC. In the
first set of experiments we solved a set of randomly generated instances of the principal component pursuit
problem (1.5). In this setting, we compare FALC with the augmented Lagrangian algorithm I-ALM [25], with
APG [5B] and with the soft-thresholding algorithm SVT [3]. In the second set of experiments, we solved a set of
randomly generated instances of the stable principal component pursuit problem (1.6) using only FALC. In
Section 4.1 we describe the methodology we have used in both experimental settings for generating random
problem instances.

4.1. Data generation. We tested FALC on randomly generated data matrices $D = X_0 + S_0 + Y_0$,
where

i. $X_0 = UV^T$, such that $U \in \mathbb{R}^{n \times r}$, $V \in \mathbb{R}^{n \times r}$ for $r = 0.05n$, and $U_{ij} \sim N(0, 1)$, $V_{ij} \sim N(0, 1)$ for all $i, j$ are independent standard Gaussian variables,
ii. $\Lambda \subset \{(i, j) : 1 \le i, j \le n\}$ such that the cardinality of $\Lambda$ is $|\Lambda| = p$ for $p = 0.05n^2$,
iii. $(S_0)_{ij} \sim U[-1, 1]$ for all $(i, j) \in \Lambda$ are independent uniform random variables between $-1$ and $1$,
iv. $(Y_0)_{ij} \sim \rho\,U[-1, 1]$ for all $i, j$ are independent uniform random variables between $-\rho$ and $\rho$.
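The generation procedure in items i to iv can be sketched in NumPy as follows. Two details are assumptions on our part because the text does not state them: the support $\Lambda$ is drawn uniformly at random without replacement, and $S_0$ is zero outside $\Lambda$. The function name and the seed argument are illustrative.

```python
import numpy as np

def generate_instance(n, rho=0.0, seed=0):
    """Return (D, X0, S0, Y0) with D = X0 + S0 + Y0 as in Section 4.1."""
    rng = np.random.default_rng(seed)
    r = int(round(0.05 * n))               # rank of the low-rank component
    p = int(round(0.05 * n * n))           # number of nonzeros in S0

    U = rng.standard_normal((n, r))
    V = rng.standard_normal((n, r))
    X0 = U @ V.T

    S0 = np.zeros((n, n))
    support = rng.choice(n * n, size=p, replace=False)   # assumed: uniform support
    S0.flat[support] = rng.uniform(-1.0, 1.0, size=p)

    Y0 = rho * rng.uniform(-1.0, 1.0, size=(n, n))       # noise in [-rho, rho]
    return X0 + S0 + Y0, X0, S0, Y0
```

Calling `generate_instance(500)` gives the noiseless setting of Section 4.2, and a positive `rho` gives the noisy setting of Section 4.3.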

4.2. Principal Component Pursuit Problem. In this section we solve the problem

$$\min_{X, S \in \mathbb{R}^{n \times n}} \ \|X\|_* + \mu_2\|\mathrm{vec}(S)\|_1 \quad \text{subject to} \quad X + S = D, \qquad (4.1)$$
and report the results of our numerical experiments comparing FALC with I-ALM [25], APG [25] and SVT [3].
All the codes for I-ALM, APG and SVT can be found at http://perception.csl.uiuc.edu/matrix-rank/home.html.
Please note that the SVT [3] algorithm was originally proposed for solving the matrix completion problem. The
algorithm we used in our numerical study is an adaptation of the SVT algorithm by Wright and Rao at
the Perception and Decision Laboratory in University of Illinois, Urbana-Champaign to solve the robust PCA
problem.

We created 10 random problems of size $n = 500$, i.e. $D \in \mathbb{R}^{500 \times 500}$, using the procedure described in
Section 4.1 where $\rho$ is set to 0, i.e. $Y_0 = 0$. We chose parameter values for each of the four algorithms
so that they produce a solution $X_{sol}$ and $S_{sol}$ with relative infeasibility approximately equal to $5 \times 10^{-9}$,
i.e. $\|X_{sol} + S_{sol} - D\|_F / \|D\|_F \approx 5 \times 10^{-9}$. For each algorithm we set the parameters by solving a set of small
problems, and these parameter values were fixed throughout the experiments; all other parameters were set to
their default values. The termination criteria are not directly comparable due to different formulations of
the problem solved by different solvers. For FALC we attempted to set the stopping parameter $\varrho$ such that
on average the stopping criterion for FALC is tighter than the stopping criteria of all the other algorithms
we tested.

1. FALC: Problem (4.1) is a special case of Problem (1.1) with $\rho = 0$. Therefore, $f^{(k)}(X, s, y)$ defined in
(2.2) simplifies to $f^{(k)}(X, S) = \tfrac{1}{2}\|\mathrm{vec}(X + S) - \mathrm{vec}(D) - \lambda^{(k)}\theta_1^{(k)}\|_2^2$ (note that for all $k \ge 1$, $\theta_2^{(k)} = 0$).
We set Cr = 0.4, c^ == 0.4, ca = 0.4, c^ = 0.999, c^ = 0.999, ca = 2 and initialize d'f' as in ^, i.e. 

^1^^ = 7\^ -^^^ ^^ ', , ,,„ , wec{sign{D)) . (4.2) 

max{\\stgn{D)\\2, y'n\\ vec{sign[D))\\oc\ 

Finally, we set gi = 1 x 10^^ and terminate FALC when the distance between successive inner iterates 
are below the threshold g for each component, i.e. \\Xl — X{ ' ~ '\\p < g and \\sl — s^ ' " II2 < £* 
for any $k \ge 1$. We used PROPACK [23] for computing partial singular value decompositions. In order
to estimate the rank of $X_0$, we followed the scheme proposed in Equation (17) in [25]. The code for
PROPACK is available at http://soi.stanford.edu/~rmunk/PROPACK/.

2. I-ALM: LALM solves min{||X||, + ^|1 vec(S')||i : X + S = D}. Let (XC^), s'*^)) be the k-th iterate. 

I-ALM terminates when II^"°'+^''''-J'IIf < ^ ^ -^q-s^ 

3. APG: For some A > 0, APG solves mintx (\\X\\^ + ^\\vec{S)\\i) +^\\X + S~D\\%\. Stopping 

tolerance is set to 5 x 10^^^ (the definition of stopping criteria is complicated, for details see partial APG 
code at [http: //perception, csl .uiuc. edu/matrix-rank/home ■htinl[ . In the code, by default A is set 
to a,„ax(^) X 10-^ 

4. SVT: SVT solves a relaxation of the robust PCA problem, 

min{A(||X||* + ^||vec(5)||i) + ^{\\X\\% + \\S\\%) : X + S = d\. Let (xW,s('=)) be the /fc-th iterate 

— „ II vC^) _(_ eC^) njj /I 

when A is set to 1 x 10 . SVT terminates wrn; '— < 5 x 10^ . Please note that we have chosen a 

weaker stopping criterion for SVT. 
The results of the experiments are displayed in Tables 4.1 and 4.2. In Tables 4.1 and 4.2 the row labeled CPU lists
the running time of each algorithm in seconds and all other rows are self-explanatory. The column labeled
average lists the average taken over the $N = 10$ random instances, and the columns labeled min (resp. max) list
the minimum (resp. maximum) over the 10 instances. The experimental results in Tables 4.1 and 4.2 show that
FALC is competitive with the state-of-the-art algorithms, e.g. I-ALM, APG and SVT, specialized for solving
the robust PCA problem. Even though FALC is not a special purpose algorithm for robust PCA, in our numerical
experiments FALC required fewer singular value decompositions than APG and
SVT. In addition, for all 10 randomly created problems in the test set, only FALC and I-ALM accurately
identified the zero-set of the sparse component $S_0$, i.e. $I_0 = \{(i, j) \in \{1, 2, \ldots, n\} \times \{1, 2, \ldots, n\} : (S_0)_{ij} = 0\}$,
without any thresholding. This feature of FALC is very appealing in practice. For signals with a large
dynamic range, almost all of the state-of-the-art efficient algorithms produce a solution with many small
nonzero terms, and it is often hard to determine the threshold.
Table 4.1
Experiment results for $n = 500$, $r = 0.05n$, $p = 0.05n^2$ and $\|X_{sol} + S_{sol} - D\|_F / \|D\|_F \approx 5 \times 10^{-9}$

                                                                                  FALC                             I-ALM
Quantity                                                                          Average   Min       Max        Average   Min       Max
svd #                                                                             42.6      40        45         31.6      30        33
$\|X_{sol} - X_0\|_F / \|X_0\|_F$                                                 4.65E-09  2.28E-09  7.04E-09   1.85E-09  5.90E-10  3.40E-09
$\|S_{sol} - S_0\|_F / \|S_0\|_F$                                                 1.79E-07  8.89E-08  2.69E-07   1.94E-07  4.80E-08  3.85E-07
$|\,\|X_{sol}\|_* - \|X_0\|_*| / \|X_0\|_*$                                       1.88E-10  4.74E-11  4.15E-10   1.13E-11  3.67E-12  2.07E-11
$\max\{|\sigma_i - \sigma_i^0| : \sigma_i^0 > 0\}$                                2.61E-07  1.01E-07  5.15E-07   8.69E-08  2.34E-08  2.54E-07
$\max\{|\sigma_i| : \sigma_i^0 = 0\}$                                             1.57E-13  4.22E-14  3.48E-13   1.47E-13  5.92E-14  3.66E-13
$|\,\|\mathrm{vec}(S_{sol})\|_1 - \|\mathrm{vec}(S_0)\|_1| / \|\mathrm{vec}(S_0)\|_1$   1.97E-08  7.44E-09  3.02E-08   2.24E-09  4.13E-10  5.11E-09
$\max\{|(S_{sol})_{ij} - (S_0)_{ij}| : |(S_0)_{ij}| > 0\}$                        1.31E-06  5.99E-07  1.75E-06   1.07E-05  2.31E-06  2.45E-05
$\max\{|(S_{sol})_{ij}| : (S_0)_{ij} = 0\}$                                       0         0         0          0         0         0
rank                                                                              25        25        25         25        25        25
$\|X_{sol} + S_{sol} - D\|_F / \|D\|_F$                                           4.67E-09  2.31E-09  7.04E-09   4.66E-09  1.08E-09  9.63E-09
CPU                                                                               23.6      19.4      32.3       15.9      12.0      24.4

Table 4.2
Experiment results for $n = 500$, $r = 0.05n$, $p = 0.05n^2$ and $\|X_{sol} + S_{sol} - D\|_F / \|D\|_F \approx 5 \times 10^{-9}$

                                                                                  APG                              SVT
Quantity                                                                          Average   Min       Max        Average   Min       Max
svd #                                                                             187.7     187       188        833.9     819       857
$\|X_{sol} - X_0\|_F / \|X_0\|_F$                                                 4.14E-09  3.99E-09  4.39E-09   1.79E-04  1.76E-04  1.80E-04
$\|S_{sol} - S_0\|_F / \|S_0\|_F$                                                 1.63E-07  1.57E-07  1.72E-07   2.04E-02  2.02E-02  2.08E-02
$|\,\|X_{sol}\|_* - \|X_0\|_*| / \|X_0\|_*$                                       3.96E-09  3.82E-09  4.20E-09   1.66E-05  1.53E-05  1.85E-05
$\max\{|\sigma_i - \sigma_i^0| : \sigma_i^0 > 0\}$                                1.99E-06  1.90E-06  2.11E-06   1.45E-02  1.17E-02  1.68E-02
$\max\{|\sigma_i| : \sigma_i^0 = 0\}$                                             1.26E-13  6.84E-14  1.91E-13   2.39E-13  7.58E-14  6.79E-13
$|\,\|\mathrm{vec}(S_{sol})\|_1 - \|\mathrm{vec}(S_0)\|_1| / \|\mathrm{vec}(S_0)\|_1$   1.83E-07  1.76E-07  1.92E-07   5.03E-03  4.89E-03  5.14E-03
$\max\{|(S_{sol})_{ij} - (S_0)_{ij}| : |(S_0)_{ij}| > 0\}$                        1.95E-07  1.80E-07  2.25E-07   1.19E-01  1.07E-01  1.33E-01
$\max\{|(S_{sol})_{ij}| : (S_0)_{ij} = 0\}$                                       3.70E-08  2.09E-08  ?.64E-08   5.50E-03  3.59E-03  1.45E-03
rank                                                                              25        25        25         25        25        25
$\|X_{sol} + S_{sol} - D\|_F / \|D\|_F$                                           5.43E-09  5.24E-09  5.77E-09   4.99E-04  4.98E-04  5.00E-04
CPU                                                                               87.7      71.6      101.6      265.2     252.0     273.1



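4.3. Stable Principal Component Pursuit Problem. In this section, we solve the problem

$$\min_{X, S \in \mathbb{R}^{n \times n}} \ \|X\|_* + \mu_2\|\mathrm{vec}(S)\|_1 \quad \text{subject to} \quad \|\mathrm{vec}(X + S - D)\|_\infty \le \rho, \qquad (4.3)$$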



and report the results of our numerical experiments using FALC. To the best of our knowledge, there is no
publicly available code specialized for solving Problem (4.3), other than general purpose SDP solvers.

We created 10 random problems of size $n = 500$, i.e. $D \in \mathbb{R}^{500 \times 500}$, using the procedure described in
Section [4 .11 where p is set to 1 x 10"*, i.e. each entry of the noise term Yq is coming from a uniform distribution 
between [— p, p\. We chose the value of the stopping parameter so that FALC produces a solution Xsoi and 



Ssoi with 



\\X,„i+S,al-D\\F 



1 X 10- 



Problem (4.3) is a special case of Problem (1.1). Therefore, $f^{(k)}(X, s, y)$ defined in (2.2) simplifies to
$f^{(k)}(X, S, y) = \tfrac{1}{2}\|\mathrm{vec}(X + S) + y - \mathrm{vec}(D) - \lambda^{(k)}\theta_1^{(k)}\|_2^2$ (note that for all $k \ge 1$, $\theta_2^{(k)} = 0$). We set the
parameter values for FALC by solving a set of small size problems and these parameter values were fixed 
throughout the experiments, all other parameters are set to their default values, i.e. Ct = 0.4, cj = 0.4, 
ex = 0.4, Cr = 0.999, c^ = 0.999. We set cx = 1.5 and initialize of'^ as in [35], i.e. as in (IT^ . 

Finally, We set p = 1 x 10~^, <j = 1 x lO^'^ and terminate FALC when either the distance between succes- 
sive inner iterates are below a threshold g for each component, i.e. || vec I X} ) — vec I A"} ' ) ||oo < £>, 

2 < g for any k > 1 oi there exist partial subgradients with sufficiently small 
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vec 



{s['^'') 



— vec 



,(fc.^-i) 



norm 



for each component, i.e. \\G\\f < ^/2, ||.g||2 < 'i for some {G,g) G dx,sP^''H-, ■, ■)\ix(''-^'i C"-*) (''■"i) ^^^^ 



{X[^ 



We have used PROPACK [23] for computing partial singular value decompositions. In order to estimate the
rank of $X_0$, we followed the scheme proposed in Equation (17) in [25]. The results of the experiments are
displayed in Table 4.3.



Table 4.3 
Experiment Results for n = 500, r = 0.05n^, p = 0.05n, p = 1 X lO^'' and 



\X,a,+S,„l-D\\F 

WdWf 



1 X 10" 





                                                                                  FALC
Quantity                                                                          Average   Min       Max
svd #                                                                             53.8      49        60
$\|X_{sol} - X_0\|_F / \|X_0\|_F$                                                 1.71E-05  1.66E-05  1.83E-05
$\|S_{sol} - S_0\|_F / \|S_0\|_F$                                                 4.04E-04  2.61E-04  9.11E-04
$|\,\|X_{sol}\|_* - \|X_0\|_*| / \|X_0\|_*$                                       1.59E-05  1.56E-05  1.61E-05
$\max\{|\sigma_i - \sigma_i^0| : \sigma_i^0 > 0\}$                                9.70E-03  9.43E-03  1.01E-02
$\max\{|\sigma_i| : \sigma_i^0 = 0\}$                                             1.56E-09  1.37E-10  7.97E-09
$|\,\|\mathrm{vec}(S_{sol})\|_1 - \|\mathrm{vec}(S_0)\|_1| / \|\mathrm{vec}(S_0)\|_1$   2.35E-04  2.12E-04  3.11E-04
$\max\{|(S_{sol})_{ij} - (S_0)_{ij}| : |(S_0)_{ij}| > 0\}$                        3.83E-03  1.26E-03  8.63E-03
$\max\{|(S_{sol})_{ij}| : (S_0)_{ij} = 0\}$                                       0         0         0
rank                                                                              25        25        25
$\|X_{sol} + S_{sol} - D\|_F / \|D\|_F$                                           2.15E-05  1.95E-05  2.89E-05
CPU                                                                               35.3      30.8      44.6

5. Extensions and conclusion. The algorithmic framework proposed in this paper extends to the 
following much more general class of problems: 



minxeK^x^ Milk(^)IU + ^^2\\C{X) - d\\p+ < R,X > +if\\X - Xo\\% 
subject to A{X) — b, 
T{X) ^ G, 
WGiX) ~h\\,<p 



(5.1) 



where the matrix norm $\|\sigma(\cdot)\|_\alpha$ denotes either the nuclear norm, the Frobenius norm, or the $\ell_2$-operator norm, the
vector norms $\|\cdot\|_\beta$ and $\|\cdot\|_\gamma$ denote either the $\ell_1$-norm, $\ell_2$-norm or the $\ell_\infty$-norm, and $A(\cdot)$, $C(\cdot)$, $\mathcal{G}(\cdot)$ and $\mathcal{F}(\cdot)$
are linear operators from $\mathbb{R}^{m \times n}$ to vector spaces of appropriate dimensions. By introducing slack variables,
(5.1) can be reformulated as follows.



min^fgR^x^ fii\\a{X)\\„ + fi2\\sch+ < R, X > +lf\\X - Xo\\%, 
subject to A{X) = b, 

CiX) + s, = d, 

jF{X) + Sf=G, Sfh 0, 



(5.2) 



6iX) 



Sg = h, 



'Sll7 



<P, 



The FALC framework extended to this more general problem inexactly solves optimization problems of the 
form: 



X,. 



mm 

||Sg||-^<P,5//^0 



f AW(Mi|k(X)|U+A*2||sc||^+<-R,X>+f llX-Xolll) ] 
-X('^\e[''Y{A{X)-b) + ^\\A{X)-b\\l 

- AW(^^''y (C(X) + sc -d) + i||C(A) + s-d\\l 

- AW(0f ^(^(A) + Sf~G) + i||-F(A) + Sf- G\\l 

- AW(^f y((?(A) + ,s, -h) + i||(?(A) + Sg - h\\l 



Note that we dualize neither the norm constraint $\|s_g\|_\gamma \le \rho$ nor the cone constraint $S_f \succeq 0$. FALC
solves (5.3) by solving constrained shrinkage problems of the following form.
1. Matrix optimization problem over simple sets:

$$\min\left\{\delta\|\sigma(X)\|_\alpha + \tfrac{1}{2}\|X - Y\|_F^2 : \|\sigma(X)\|_\alpha \le \eta\right\}, \qquad (5.3)$$

$$\min\left\{\tfrac{1}{2}\|S_f - Y\|_F^2 : S_f \succeq 0\right\}. \qquad (5.4)$$

For a given $Y \in \mathbb{R}^{m \times n}$, these problems can be efficiently solved when $\|\sigma(\cdot)\|_\alpha$ is either
the nuclear norm, Frobenius norm, or the $\ell_2$-operator norm, or equivalently, the $\ell_1$, $\ell_2$ or $\ell_\infty$ norm of the
singular values of $X$. Note that the subproblem given in (5.4) is only needed when solving the augmented
Lagrangian subproblem corresponding to $\mathcal{F}(X) \preceq G$ constraints.

2. Vector optimization problem over simple sets:

$$\min\left\{\delta\|x\|_\beta + \tfrac{1}{2}\|x - y\|_2^2\right\}, \qquad (5.5)$$

$$\min\left\{\tfrac{1}{2}\|s_g - y\|_2^2 : \|s_g\|_\gamma \le \rho\right\}. \qquad (5.6)$$

For a given $y$, these problems can be efficiently solved when $\beta$ and $\gamma$ are either $\ell_2$, $\ell_1$ or $\ell_\infty$ vector
norms. Note that the subproblem given in (5.6) is only needed when solving the augmented Lagrangian
subproblem corresponding to $\|\mathcal{G}(X) - h\|_\gamma \le \rho$ constraints.
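Both matrix subproblems reduce to a spectral decomposition followed by a vector operation on the spectrum. The sketch below illustrates this for (5.4) and for (5.3) with the nuclear norm ($\alpha = 1$); it assumes dense matrices and a full decomposition, whereas the experiments in Section 4 use partial SVDs. The function names are ours.

```python
import numpy as np

def project_psd(Y):
    """(5.4): argmin ||S - Y||_F over positive semidefinite S, for symmetric Y."""
    w, Q = np.linalg.eigh((Y + Y.T) / 2.0)     # symmetrize against round-off
    return (Q * np.maximum(w, 0.0)) @ Q.T      # drop the negative eigenvalues

def nuclear_shrink(Y, delta, eta):
    """(5.3) with alpha = 1: argmin delta*||X||_* + 0.5*||X - Y||_F^2, ||X||_* <= eta."""
    U, d, Vt = np.linalg.svd(Y, full_matrices=False)
    d = np.maximum(d - delta, 0.0)             # unconstrained soft-thresholding
    if d.sum() > eta:                          # bound is active: shrink further
        css = np.cumsum(d)                     # d is already sorted decreasingly
        k = np.arange(1, d.size + 1)
        idx = np.nonzero(d * k > css - eta)[0][-1]
        d = np.maximum(d - (css[idx] - eta) / (idx + 1.0), 0.0)
    return (U * d) @ Vt
```

When the norm bound is inactive the second function is plain singular value thresholding; when it is active, the extra shift is the projection of the thresholded singular values onto the $\ell_1$ ball of radius $\eta$.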
The extension (5.1) allows us to model a wider class of problems. Setting $A = \mathcal{F} = 0$ and $\gamma = 2$ results in
a special case that includes matrix completion problems with noisy data. Setting $\beta = \infty$, $\alpha = 2$ and dropping
the norm constraint $\|\mathcal{G}(X) - h\|_\gamma \le \rho$ results in a special case that arises in the optimal acquisition basis
design for compressive sensing.

The main contribution of this paper is an efficient first-order augmented Lagrangian algorithm (FALC)
for the composite norm minimization problem (1.1) and, by extension, (5.1). FALC recovers the low rank
target matrix by solving a sequence of augmented Lagrangian subproblems, and each subproblem is solved
using Algorithm 3 in [33]. We show that the continuation scheme on the penalty parameter $\lambda$ used in FALC
provably converges to the target signal, and we are also able to compute a convergence rate. The performance
of FALC in our limited numerical experiments has been very promising.
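Appendix A. Auxiliary results.

Theorem A.1. Let $f : \mathbb{R}^{m \times n} \times \mathbb{R}^p \times \mathbb{R}^q \to \mathbb{R}$ denote a convex function with a Lipschitz continuous
gradient $\nabla f$ with a Lipschitz constant $L$ with respect to the norm $\|\cdot\|$ on $\mathbb{R}^{m \times n} \times \mathbb{R}^p \times \mathbb{R}^q$ defined as follows:

$$\|(X, s, y)\| = \sqrt{\|X\|_F^2 + \|s\|_2^2 + \|y\|_2^2}. \qquad (A.1)$$

Let $(X_*, s_*, y_*) \in \operatorname{argmin}\{\lambda(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|s\|_\beta) + f(X, s, y) : X \in \mathbb{R}^{m \times n}, s \in \mathbb{R}^p, \|y\|_\gamma \le \rho\}$. Suppose
$(\bar{X}, \bar{s}, \bar{y}) \in \mathbb{R}^{m \times n} \times \mathbb{R}^p \times \mathbb{R}^q$ such that $\|\bar{y}\|_\gamma \le \rho$ satisfies

$$\lambda\left(\mu_1\|\sigma(\bar{X})\|_\alpha + \mu_2\|\bar{s}\|_\beta\right) + f(\bar{X}, \bar{s}, \bar{y}) \le \lambda\left(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|s_*\|_\beta\right) + f(X_*, s_*, y_*) + \epsilon$$

for some $\epsilon > 0$. Then

$$\|\nabla_X f(\bar{X}, \bar{s}, \bar{y})\|_F \le \sqrt{2L\epsilon} + I(\alpha^*)\lambda\mu_1, \qquad \|\nabla_s f(\bar{X}, \bar{s}, \bar{y})\|_2 \le \sqrt{2L\epsilon} + J(\beta^*)\lambda\mu_2,$$

where $I(\cdot)$ and $J(\cdot)$ are defined in (2.4), and $\alpha^*$ and $\beta^*$ denote the Hölder conjugates of $\alpha$ and $\beta$, respectively.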

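Proof. Since $\nabla f$ is Lipschitz continuous with constant $L$, the triangle inequality for $\|\sigma(\cdot)\|_\alpha$ and $\|\cdot\|_\beta$
implies that for any $Y \in \mathbb{R}^{m \times n}$, $q \in \mathbb{R}^p$ and $z \in \mathbb{R}^q$

$$\lambda\left(\mu_1\|\sigma(Y)\|_\alpha + \mu_2\|q\|_\beta\right) + f(Y, q, z) \le \lambda\left(\mu_1\|\sigma(\bar{X})\|_\alpha + \mu_2\|\bar{s}\|_\beta\right) + f(\bar{X}, \bar{s}, \bar{y}) + \lambda\left(\mu_1\|\sigma(Y - \bar{X})\|_\alpha + \mu_2\|q - \bar{s}\|_\beta\right)$$
$$+ \langle \nabla_X f(\bar{X}, \bar{s}, \bar{y}), Y - \bar{X} \rangle + \nabla_s f(\bar{X}, \bar{s}, \bar{y})^T(q - \bar{s}) + \nabla_y f(\bar{X}, \bar{s}, \bar{y})^T(z - \bar{y}) + \frac{L}{2}\left(\|Y - \bar{X}\|_F^2 + \|q - \bar{s}\|_2^2 + \|z - \bar{y}\|_2^2\right),$$

where $\langle X, Y \rangle = \mathrm{Tr}(X^T Y) \in \mathbb{R}$ denotes the usual Euclidean inner product of $X \in \mathbb{R}^{m \times n}$ and $Y \in \mathbb{R}^{m \times n}$.
Since $Y$, $q$ and $z$ are arbitrary, it follows that

$$\lambda\left(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|s_*\|_\beta\right) + f(X_*, s_*, y_*) \le \lambda\left(\mu_1\|\sigma(\bar{X})\|_\alpha + \mu_2\|\bar{s}\|_\beta\right) + f(\bar{X}, \bar{s}, \bar{y})$$
$$+ \min_{Y \in \mathbb{R}^{m \times n}} \left\{ \langle \nabla_X f(\bar{X}, \bar{s}, \bar{y}), Y - \bar{X} \rangle + \frac{L}{2}\|Y - \bar{X}\|_F^2 + \lambda\mu_1\|\sigma(Y - \bar{X})\|_\alpha \right\}$$
$$+ \min_{q \in \mathbb{R}^p} \left\{ \nabla_s f(\bar{X}, \bar{s}, \bar{y})^T(q - \bar{s}) + \frac{L}{2}\|q - \bar{s}\|_2^2 + \lambda\mu_2\|q - \bar{s}\|_\beta \right\}$$
$$+ \min_{z : \|z\|_\gamma \le \rho} \left\{ \nabla_y f(\bar{X}, \bar{s}, \bar{y})^T(z - \bar{y}) + \frac{L}{2}\|z - \bar{y}\|_2^2 \right\}. \qquad (A.2)$$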

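The first minimization problem on the right hand side of (A.2) can be simplified as follows:

$$\min_{Y \in \mathbb{R}^{m \times n}} \left\{ \langle \nabla_X f(\bar{X}, \bar{s}, \bar{y}), Y - \bar{X} \rangle + \frac{L}{2}\|Y - \bar{X}\|_F^2 + \lambda\mu_1\|\sigma(Y - \bar{X})\|_\alpha \right\}$$
$$= \max_{W : \|\sigma(W)\|_{\alpha^*} \le \lambda\mu_1} \ \min_{Y \in \mathbb{R}^{m \times n}} \left\{ \frac{L}{2}\|Y - \bar{X}\|_F^2 + \langle \nabla_X f(\bar{X}, \bar{s}, \bar{y}) + W, Y - \bar{X} \rangle \right\}, \qquad (A.3)$$
$$= \max_{W : \|\sigma(W)\|_{\alpha^*} \le \lambda\mu_1} \left\{ \frac{L}{2}\|Y^*(W) - \bar{X}\|_F^2 + \langle \nabla_X f(\bar{X}, \bar{s}, \bar{y}) + W, Y^*(W) - \bar{X} \rangle \right\},$$
$$= -\min_{W : \|\sigma(W)\|_{\alpha^*} \le \lambda\mu_1} \frac{\|\nabla_X f(\bar{X}, \bar{s}, \bar{y}) + W\|_F^2}{2L}, \qquad (A.4)$$

where $Y^*(W) = \bar{X} - \frac{\nabla_X f(\bar{X}, \bar{s}, \bar{y}) + W}{L}$ is the minimizer of the inner minimization problem in (A.3).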



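The second minimization problem on the right hand side of (A.2) can be simplified as follows:

$$\min_{q \in \mathbb{R}^p} \left\{ \nabla_s f(\bar{X}, \bar{s}, \bar{y})^T(q - \bar{s}) + \frac{L}{2}\|q - \bar{s}\|_2^2 + \lambda\mu_2\|q - \bar{s}\|_\beta \right\}$$
$$= \max_{u : \|u\|_{\beta^*} \le \lambda\mu_2} \ \min_{q \in \mathbb{R}^p} \left\{ \frac{L}{2}\|q - \bar{s}\|_2^2 + \left(\nabla_s f(\bar{X}, \bar{s}, \bar{y}) + u\right)^T(q - \bar{s}) \right\}, \qquad (A.5)$$
$$= \max_{u : \|u\|_{\beta^*} \le \lambda\mu_2} \left\{ \frac{L}{2}\|q^*(u) - \bar{s}\|_2^2 + \left(\nabla_s f(\bar{X}, \bar{s}, \bar{y}) + u\right)^T(q^*(u) - \bar{s}) \right\},$$
$$= -\min_{u : \|u\|_{\beta^*} \le \lambda\mu_2} \frac{\|\nabla_s f(\bar{X}, \bar{s}, \bar{y}) + u\|_2^2}{2L}, \qquad (A.6)$$

where $q^*(u) = \bar{s} - \frac{\nabla_s f(\bar{X}, \bar{s}, \bar{y}) + u}{L}$ is the minimizer of the inner minimization problem in (A.5).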

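Since $\|\bar y\|_\gamma \le \rho$, the point $z = \bar y$ is feasible, so the following is true for the third minimization problem on the right hand side of (A.2):

\[
\min_{z:\,\|z\|_\gamma\le\rho} \Big\{ \nabla_y f(\bar X,\bar s,\bar y)^T(z-\bar y) + \tfrac{L}{2}\|z-\bar y\|_2^2 \Big\} \le 0. \tag{A.7}
\]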

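Thus, (A.2), (A.4), (A.6) and (A.7) together imply that

\begin{align*}
\lambda(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|s_*\|_\beta) + f(X_*,s_*,y_*) \le{}& \lambda(\mu_1\|\sigma(\bar X)\|_\alpha + \mu_2\|\bar s\|_\beta) + f(\bar X,\bar s,\bar y) \\
&- \min_{W:\,\|\sigma(W)\|_{\alpha^*}\le\lambda\mu_1} \frac{\|\nabla_X f(\bar X,\bar s,\bar y) + W\|_F^2}{2L} - \min_{u:\,\|u\|_{\beta^*}\le\lambda\mu_2} \frac{\|\nabla_s f(\bar X,\bar s,\bar y) + u\|_2^2}{2L}.
\end{align*}

Since $\big(\lambda(\mu_1\|\sigma(\bar X)\|_\alpha + \mu_2\|\bar s\|_\beta) + f(\bar X,\bar s,\bar y)\big) - \big(\lambda(\mu_1\|\sigma(X_*)\|_\alpha + \mu_2\|s_*\|_\beta) + f(X_*,s_*,y_*)\big) \le \epsilon$, we have that

\[
\min_{W:\,\|\sigma(W)\|_{\alpha^*}\le\lambda\mu_1} \|\nabla_X f(\bar X,\bar s,\bar y) + W\|_F^2 + \min_{u:\,\|u\|_{\beta^*}\le\lambda\mu_2} \|\nabla_s f(\bar X,\bar s,\bar y) + u\|_2^2 \le 2L\epsilon. \tag{A.8}
\]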

From the definition of $I(\cdot)$, it follows that $\|W\|_F \le I(\alpha^*)\|\sigma(W)\|_{\alpha^*}$. Thus, (A.8) implies that

\[
\min_{W:\,\|W\|_F\le I(\alpha^*)\lambda\mu_1} \|\nabla_X f(\bar X,\bar s,\bar y) + W\|_F^2 \le 2L\epsilon. \tag{A.9}
\]

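Suppose $\|\nabla_X f(\bar X,\bar s,\bar y)\|_F > I(\alpha^*)\lambda\mu_1$. Then the optimal solution of the optimization problem in (A.9) is

\[
W^* = -I(\alpha^*)\lambda\mu_1\, \frac{\nabla_X f(\bar X,\bar s,\bar y)}{\|\nabla_X f(\bar X,\bar s,\bar y)\|_F}.
\]

Then (A.9) implies that $\big(\|\nabla_X f(\bar X,\bar s,\bar y)\|_F - I(\alpha^*)\lambda\mu_1\big)^2 \le 2L\epsilon$, i.e. $\|\nabla_X f(\bar X,\bar s,\bar y)\|_F \le \sqrt{2L\epsilon} + I(\alpha^*)\lambda\mu_1$. This is trivially true when $\|\nabla_X f(\bar X,\bar s,\bar y)\|_F \le I(\alpha^*)\lambda\mu_1$. Therefore, we can conclude that in all cases

\[
\|\nabla_X f(\bar X,\bar s,\bar y)\|_F \le \sqrt{2L\epsilon} + I(\alpha^*)\lambda\mu_1.
\]

A similar analysis establishes that $\|\nabla_s f(\bar X,\bar s,\bar y)\|_2 \le \sqrt{2L\epsilon} + J(\beta^*)\lambda\mu_2$. □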
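The closed form used in the last step of the proof can be checked numerically. The following is a minimal Python sketch, assuming NumPy; the function name `ball_shifted_min` and the symbols `G` (standing in for the gradient) and `r` (standing in for $I(\alpha^*)\lambda\mu_1$) are hypothetical and introduced only for illustration. It evaluates $\min\{\|G+W\|_F^2 : \|W\|_F \le r\}$ through the minimizer $W^* = -rG/\|G\|_F$.

```python
import numpy as np

def ball_shifted_min(G, r):
    """Minimize ||G + W||_F^2 over the Frobenius ball ||W||_F <= r.

    Returns the minimizer W and the optimal value. G plays the role of the
    gradient and r the role of I(alpha*) * lambda * mu_1 (illustrative names).
    """
    g = np.linalg.norm(G)  # Frobenius norm for a 2-D array
    if g <= r:
        # -G itself is feasible, so the objective can be driven to zero.
        return -G, 0.0
    # Otherwise move from G toward the origin as far as the ball allows.
    W = -(r / g) * G
    return W, (g - r) ** 2
```

When $\|G\|_F > r$ the returned value is $(\|G\|_F - r)^2$, which is the quantity bounded by $2L\epsilon$ in the argument above.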
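Corollary A.2. Let $\alpha, \beta \in \{1, 2, \infty\}$ and

\[
P(X,s,y) = \lambda(\mu_1\|\sigma(X)\|_\alpha + \mu_2\|s\|_\beta) + f(X,s,y), \qquad f(X,s,y) = \tfrac12\|\mathcal{A}(X) + y - b - \lambda\theta_1\|_2^2 + \tfrac12\|\mathcal{C}(X) + s - d - \lambda\theta_2\|_2^2,
\]

where $\mathcal{A} : \mathbb{R}^{m\times n} \to \mathbb{R}^q$ and $\mathcal{C} : \mathbb{R}^{m\times n} \to \mathbb{R}^p$ denote linear maps with $\mathcal{A}(X) = A\operatorname{vec}(X)$ and $\mathcal{C}(X) = C\operatorname{vec}(X)$ for all $X \in \mathbb{R}^{m\times n}$, where $A \in \mathbb{R}^{q\times mn}$ and $C \in \mathbb{R}^{p\times mn}$ are the matrix representations of the linear maps $\mathcal{A}(\cdot)$ and $\mathcal{C}(\cdot)$, respectively; and $\operatorname{vec}(X)$ denotes the vector obtained by stacking the columns of $X$ in order. Suppose $(\bar X,\bar s,\bar y)$ is $\epsilon$-optimal for the problem $\min\{P(X,s,y) : X \in \mathbb{R}^{m\times n},\ s \in \mathbb{R}^p,\ \|y\|_\gamma \le \rho\}$, i.e.

\[
0 \le P(\bar X,\bar s,\bar y) - \min_{X\in\mathbb{R}^{m\times n},\ s\in\mathbb{R}^p,\ \|y\|_\gamma\le\rho} P(X,s,y) \le \epsilon,
\]

and the matrix $A$ has full row rank. Then

\begin{align*}
\|\mathcal{A}(\bar X) + \bar y - b - \lambda\theta_1\|_2 &\le \frac{I(\alpha^*)\mu_1 + \sigma_{\max}(C)J(\beta^*)\mu_2}{\sigma_{\min}(A)}\,\lambda + \frac{\sigma_{\max}(M)\big(1 + \sigma_{\max}(C)\big)}{\sigma_{\min}(A)}\,\sqrt{2\epsilon}, \\
\|\mathcal{C}(\bar X) + \bar s - d - \lambda\theta_2\|_2 &\le J(\beta^*)\mu_2\lambda + \sigma_{\max}(M)\sqrt{2\epsilon},
\end{align*}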

where $\sigma_{\max}(X)$ and $\sigma_{\min}(X)$ denote respectively the maximum and minimum singular values of a matrix $X$; $M = \begin{pmatrix} I & 0 & C \\ 0 & I & A \end{pmatrix}$, and $I(\alpha^*)$, $J(\beta^*)$ are as in Theorem A.1.

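Proof. Let $f(X,s,y) = \tfrac12\|\mathcal{A}(X) + y - b - \lambda\theta_1\|_2^2 + \tfrac12\|\mathcal{C}(X) + s - d - \lambda\theta_2\|_2^2$ and let $\|\cdot\|$ be the norm on $\mathbb{R}^{m\times n}\times\mathbb{R}^p\times\mathbb{R}^q$ given by $\|(X,s,y)\| = \sqrt{\|X\|_F^2 + \|s\|_2^2 + \|y\|_2^2}$. Then for any $X_1, X_2 \in \mathbb{R}^{m\times n}$, $s_1, s_2 \in \mathbb{R}^p$ and $y_1, y_2 \in \mathbb{R}^q$, we have

\begin{align*}
&\|\nabla f(X_1,s_1,y_1) - \nabla f(X_2,s_2,y_2)\|^2 \\
&\quad= \|\nabla_X f(X_1,s_1,y_1) - \nabla_X f(X_2,s_2,y_2)\|_F^2 + \|\nabla_s f(X_1,s_1,y_1) - \nabla_s f(X_2,s_2,y_2)\|_2^2 + \|\nabla_y f(X_1,s_1,y_1) - \nabla_y f(X_2,s_2,y_2)\|_2^2 \\
&\quad= \|\mathcal{A}^*(\mathcal{A}(X_1-X_2) + y_1-y_2) + \mathcal{C}^*(\mathcal{C}(X_1-X_2) + s_1-s_2)\|_F^2 \\
&\qquad+ \|\mathcal{C}(X_1-X_2) + s_1-s_2\|_2^2 + \|\mathcal{A}(X_1-X_2) + y_1-y_2\|_2^2 \\
&\quad= \|A^T(A\operatorname{vec}(X_1-X_2) + y_1-y_2) + C^T(C\operatorname{vec}(X_1-X_2) + s_1-s_2)\|_2^2 \\
&\qquad+ \|C\operatorname{vec}(X_1-X_2) + s_1-s_2\|_2^2 + \|A\operatorname{vec}(X_1-X_2) + y_1-y_2\|_2^2 \\
&\quad= \left\| M^TM \begin{pmatrix} s_1-s_2 \\ y_1-y_2 \\ \operatorname{vec}(X_1-X_2) \end{pmatrix} \right\|_2^2.
\end{align*}

Hence,

\begin{align*}
\|\nabla f(X_1,s_1,y_1) - \nabla f(X_2,s_2,y_2)\| &\le \sigma_{\max}^2(M) \left\| \begin{pmatrix} s_1-s_2 \\ y_1-y_2 \\ \operatorname{vec}(X_1-X_2) \end{pmatrix} \right\|_2 \\
&= \sigma_{\max}^2(M) \sqrt{\|X_1-X_2\|_F^2 + \|s_1-s_2\|_2^2 + \|y_1-y_2\|_2^2} \\
&= \sigma_{\max}^2(M)\, \|(X_1,s_1,y_1) - (X_2,s_2,y_2)\|,
\end{align*}

where $\sigma_{\max}(M)$ is the maximum singular value of $M$. Thus, $f : \mathbb{R}^{m\times n}\times\mathbb{R}^p\times\mathbb{R}^q \to \mathbb{R}$ is a convex function and $\nabla f$ is Lipschitz continuous with respect to $\|\cdot\|$ with Lipschitz constant $L = \sigma_{\max}^2(M)$.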

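Since $(\bar X,\bar s,\bar y)$ is an $\epsilon$-optimal solution to the problem $\min\{P(X,s,y) : X \in \mathbb{R}^{m\times n},\ s \in \mathbb{R}^p,\ \|y\|_\gamma \le \rho\}$, Theorem A.1 guarantees that

\[
\|\nabla_X f(\bar X,\bar s,\bar y)\|_F = \|\mathcal{A}^*(\mathcal{A}(\bar X) + \bar y - b - \lambda\theta_1) + \mathcal{C}^*(\mathcal{C}(\bar X) + \bar s - d - \lambda\theta_2)\|_F \le \sqrt{2\epsilon}\,\sigma_{\max}(M) + I(\alpha^*)\lambda\mu_1, \tag{A.10}
\]

and

\[
\|\nabla_s f(\bar X,\bar s,\bar y)\|_2 = \|\mathcal{C}(\bar X) + \bar s - d - \lambda\theta_2\|_2 \le \sqrt{2\epsilon}\,\sigma_{\max}(M) + J(\beta^*)\lambda\mu_2. \tag{A.11}
\]

The bound (A.10) and the triangle inequality for the Frobenius norm imply that

\[
\|\mathcal{A}^*(\mathcal{A}(\bar X) + \bar y - b - \lambda\theta_1)\|_F \le \|\mathcal{C}^*(\mathcal{C}(\bar X) + \bar s - d - \lambda\theta_2)\|_F + \sqrt{2\epsilon}\,\sigma_{\max}(M) + I(\alpha^*)\lambda\mu_1. \tag{A.12}
\]

Since $\|\mathcal{C}^*(\mathcal{C}(\bar X) + \bar s - d - \lambda\theta_2)\|_F \le \sigma_{\max}(C)\,\|\mathcal{C}(\bar X) + \bar s - d - \lambda\theta_2\|_2$, (A.11) and (A.12) imply that

\[
\|\mathcal{A}^*(\mathcal{A}(\bar X) + \bar y - b - \lambda\theta_1)\|_F \le \sigma_{\max}(C)\big(\sqrt{2\epsilon}\,\sigma_{\max}(M) + J(\beta^*)\lambda\mu_2\big) + \sqrt{2\epsilon}\,\sigma_{\max}(M) + I(\alpha^*)\lambda\mu_1.
\]

Consequently, since $A$ has full row rank,

\begin{align*}
\|\mathcal{A}(\bar X) + \bar y - b - \lambda\theta_1\|_2 &\le \frac{1}{\sigma_{\min}(A)}\, \|\mathcal{A}^*(\mathcal{A}(\bar X) + \bar y - b - \lambda\theta_1)\|_F \\
&\le \frac{1}{\sigma_{\min}(A)} \Big( \sigma_{\max}(C)\big(\sqrt{2\epsilon}\,\sigma_{\max}(M) + J(\beta^*)\lambda\mu_2\big) + \sqrt{2\epsilon}\,\sigma_{\max}(M) + I(\alpha^*)\lambda\mu_1 \Big).
\end{align*}

□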
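The Lipschitz estimate in this proof can be sanity-checked numerically. The sketch below is illustrative only and assumes NumPy. The block form of `M` (identity blocks acting on $s$ and $y$, followed by $C$ and $A$) is an assumption chosen so that $M^TM$ reproduces the gradient difference in the ordering $(s, y, \operatorname{vec}(X))$ used above. The names `grad_f`, `lipschitz_constant`, `b_shift` and `d_shift` are hypothetical, with the shifts standing in for $b + \lambda\theta_1$ and $d + \lambda\theta_2$.

```python
import numpy as np

def grad_f(x, s, y, A, C, b_shift, d_shift):
    """Gradient of f = 0.5*||A x + y - b_shift||^2 + 0.5*||C x + s - d_shift||^2.

    x is vec(X). Returns the tuple (grad_x, grad_s, grad_y).
    """
    res_a = A @ x + y - b_shift   # residual of the A-block
    res_c = C @ x + s - d_shift   # residual of the C-block
    return A.T @ res_a + C.T @ res_c, res_c, res_a

def lipschitz_constant(A, C):
    """sigma_max(M)^2 for the ASSUMED block matrix M = [[I, 0, C], [0, I, A]]."""
    q, p = A.shape[0], C.shape[0]
    M = np.block([
        [np.eye(p), np.zeros((p, q)), C],
        [np.zeros((q, p)), np.eye(q), A],
    ])
    return np.linalg.norm(M, 2) ** 2  # ord=2 gives the largest singular value
```

Evaluating `grad_f` at two random points and comparing the norm of the difference with `lipschitz_constant(A, C)` times the distance between the points gives a quick check of the bound $L = \sigma_{\max}^2(M)$.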

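Lemma A.3. Let $(\mathcal{M}, \|\cdot\|)$ be a normed vector space, $f : \mathcal{M} \to \mathbb{R}$ be a strictly convex function and $\chi \subset \mathcal{M}$ be a closed, convex set with a non-empty interior. Let $\bar x = \operatorname{argmin}_{x\in\chi} f(x)$ and $x^* = \operatorname{argmin}_{x\in\mathcal{M}} f(x)$. If $x^* \notin \chi$, then $\bar x \in \operatorname{bd}\chi$, where $\operatorname{bd}\chi$ denotes the boundary of $\chi$.

Proof. We will establish the result by contradiction. Assume $\bar x$ is in the interior of $\chi$, i.e. $\bar x \in \operatorname{int}(\chi)$. Then there exists $\epsilon > 0$ such that $B(\bar x, \epsilon) = \{x \in \mathcal{M} : \|x - \bar x\| < \epsilon\} \subset \chi$. Since $f$ is strictly convex and $x^* \ne \bar x$, $f(x^*) < f(\bar x)$. Choose $0 < \lambda < \frac{\epsilon}{\|x^* - \bar x\|} < 1$ so that $\lambda x^* + (1-\lambda)\bar x \in B(\bar x, \epsilon) \subset \chi$. Since $f$ is strictly convex,

\[
f(\lambda x^* + (1-\lambda)\bar x) < \lambda f(x^*) + (1-\lambda) f(\bar x) < f(\bar x). \tag{A.13}
\]

However, $\lambda x^* + (1-\lambda)\bar x \in B(\bar x,\epsilon) \subset \chi$ and $f(\lambda x^* + (1-\lambda)\bar x) < f(\bar x)$ contradicts the fact that $f(\bar x) \le f(x)$ for all $x \in \chi$. Therefore, $\bar x \notin \operatorname{int}(\chi)$. Since $\bar x \in \chi$, it follows that $\bar x \in \operatorname{bd}\chi$. □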

Next, we collect together complexity results for optimization problems of the form $\min_{X\in\mathbb{R}^{m\times n}}\{\delta\|\sigma(X)\|_\alpha + \tfrac12\|X - Y\|_F^2 : \|\sigma(X)\|_\alpha \le \eta\}$ and $\min_{s\in\mathbb{R}^p}\{\delta\|s\|_\beta + \tfrac12\|s - q\|_2^2 : \|s\|_\beta \le \eta\}$ that need to be solved in each FALC update step.

Lemma A.4. Let $X^* = \operatorname{argmin}\{\delta\|\sigma(X)\|_\alpha + \tfrac12\|X - Y\|_F^2 : \|\sigma(X)\|_\alpha \le \eta,\ X \in \mathbb{R}^{m\times n}\}$ denote the optimal solution of the constrained matrix shrinkage problem. Then

\[
X^* = U\operatorname{diag}(s^*)V^T,
\]

where $U\operatorname{diag}(\sigma)V^T$, $\sigma \in \mathbb{R}^r_+$, denotes the SVD of $Y$ and $s^*$ denotes the optimal solution of the constrained vector shrinkage problem

\[
\min\{\delta\|s\|_\alpha + \tfrac12\|s - \sigma\|_2^2 : \|s\|_\alpha \le \eta,\ s \in \mathbb{R}^r\}.
\]

Since the worst case complexity of computing the SVD of $Y$ is $O(\min\{n^2m, m^2n\})$, the complexity of computing $X^*$ is $O(\min\{n^2m, m^2n\} + T_v(r,\alpha))$, where $T_v(r,\alpha)$ denotes the complexity of computing the solution of an $r$-dimensional constrained vector shrinkage problem with norm $\|\cdot\|_\alpha$. The function

Proof. Standard results in non-linear convex optimization over matrices imply that $X^*$ is of the form $X^* = U\operatorname{diag}(s^*)V^T$ (see Corollary 2.5 in gl]).
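The reduction from the matrix problem to a vector problem on the singular values can be sketched in Python as follows, assuming NumPy. The names `matrix_shrink` and `soft_threshold` are hypothetical. As an example vector solver, the sketch uses entrywise soft-thresholding, which solves the vector problem only in the unconstrained $\ell_1$ case ($\alpha = 1$, $\eta = \infty$); any other solver for the constrained vector shrinkage problem could be passed in its place.

```python
import numpy as np

def matrix_shrink(Y, vector_solver):
    """Solve the matrix shrinkage problem through the SVD of Y.

    vector_solver maps the singular values sigma of Y to the solution s*
    of the corresponding vector shrinkage problem.
    """
    U, sigma, Vt = np.linalg.svd(Y, full_matrices=False)
    s_star = vector_solver(sigma)
    # U diag(s*) V^T, written as a column scaling of U.
    return (U * s_star) @ Vt

def soft_threshold(delta):
    """Vector solver for the unconstrained l1 case (alpha = 1, eta = infinity)."""
    return lambda sigma: np.maximum(sigma - delta, 0.0)
```

For instance, `matrix_shrink(Y, soft_threshold(0.5))` shrinks every singular value of `Y` by 0.5 and truncates at zero; the cost is dominated by the single SVD, in line with the complexity statement of the lemma.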

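Now, consider the vector constrained shrinkage problem $\min_{x\in\mathbb{R}^p}\{\delta\|x\|_\beta + \tfrac12\|x - y\|_2^2 : \|x\|_\beta \le \eta\}$.
(i) $\beta = 1$: See Lemma A.4 in [1].

(ii) $\beta = 2$: First consider the unconstrained case, i.e. $\eta = \infty$. Since the $\ell_2$-norm is self dual, $\delta\|x\|_2 = \max\{u^Tx : \|u\|_2 \le \delta\}$. Thus,

\begin{align}
\min_{x\in\mathbb{R}^p} \Big\{ \delta\|x\|_2 + \tfrac12\|x-y\|_2^2 \Big\} &= \min_{x\in\mathbb{R}^p}\ \max_{u:\,\|u\|_2\le\delta} \Big\{ u^Tx + \tfrac12\|x-y\|_2^2 \Big\} \notag\\
&= \max_{u:\,\|u\|_2\le\delta}\ \min_{x\in\mathbb{R}^p} \Big\{ u^Tx + \tfrac12\|x-y\|_2^2 \Big\} \notag\\
&= \max_{u:\,\|u\|_2\le\delta} \Big\{ u^T(y-u) + \tfrac12\|u\|_2^2 \Big\}, \tag{A.15}\\
&= \tfrac12\|y\|_2^2 - \min_{u:\,\|u\|_2\le\delta} \tfrac12\|u-y\|_2^2, \notag
\end{align}

where (A.15) follows from the fact that $x^*(u) := \operatorname{argmin}_x \{u^Tx + \tfrac12\|x-y\|_2^2\} = y - u$. Define

\[
u^* := \operatorname{argmin}_{u:\,\|u\|_2\le\delta} \tfrac12\|u-y\|_2^2 = y\,\min\Big\{\frac{\delta}{\|y\|_2},\, 1\Big\}.
\]

Then the unconstrained optimal solution is $\bar x = x^*(u^*) = y\,\max\big\{1 - \frac{\delta}{\|y\|_2},\, 0\big\}$, and the complexity of computing $\bar x$ is $O(p)$.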

Next, consider the constrained optimization problem. The constrained optimum $x^* = \bar x$ whenever $\bar x$ is feasible, i.e. $\|\bar x\|_2 \le \eta$. Since $f(x) := \delta\|x\|_2 + \tfrac12\|x - y\|_2^2$ is strongly convex, Lemma A.3 implies that $\|x^*\|_2 = \eta$ whenever $\|\bar x\|_2 > \eta$. Thus,

\[
\min\Big\{ \delta\|x\|_2 + \tfrac12\|x-y\|_2^2 : \|x\|_2 \le \eta \Big\} = \delta\eta + \min\Big\{ \tfrac12\|x-y\|_2^2 : \|x\|_2 = \eta \Big\}.
\]

The unique KKT point for the optimization problem $\min\{\tfrac12\|x-y\|_2^2 : \|x\|_2^2 = \eta^2\}$ is given by $x^* = \eta\,\frac{y}{\|y\|_2}$, and the KKT multiplier for the constraint $\|x\|_2^2 = \eta^2$ is $\vartheta = \frac{\|y\|_2}{\eta} - 1$. It is easy to check that $\vartheta > 0$ whenever $\|\bar x\|_2 > \eta$. Thus, $x^*$ is optimal for the convex optimization problem $\min\{\tfrac12\|x-y\|_2^2 : \|x\|_2^2 \le \eta^2\}$ and, consequently, optimal for the equality constrained optimization problem $\min\{\tfrac12\|x-y\|_2^2 : \|x\|_2 = \eta\}$. Hence, the complexity of computing $x^*$ is $O(p)$.
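(iii) $\beta = \infty$: First consider the unconstrained problem. Since the $\ell_1$-norm is the dual norm of the $\ell_\infty$-norm, we have that

\begin{align}
\min_{x\in\mathbb{R}^p} \Big\{ \delta\|x\|_\infty + \tfrac12\|x-y\|_2^2 \Big\} &= \min_{x\in\mathbb{R}^p}\ \max_{u:\,\|u\|_1\le\delta} \Big\{ u^Tx + \tfrac12\|x-y\|_2^2 \Big\} \notag\\
&= \max_{u:\,\|u\|_1\le\delta}\ \min_{x\in\mathbb{R}^p} \Big\{ u^Tx + \tfrac12\|x-y\|_2^2 \Big\} \notag\\
&= \max_{u:\,\|u\|_1\le\delta} \Big\{ u^T(y-u) + \tfrac12\|u\|_2^2 \Big\}, \tag{A.16}\\
&= \tfrac12\|y\|_2^2 - \min_{u:\,\|u\|_1\le\delta} \tfrac12\|u-y\|_2^2, \notag
\end{align}

where (A.16) follows from the fact that $\operatorname{argmin}_x \{u^Tx + \tfrac12\|x-y\|_2^2\} = y - u$. The result in (i) now implies that $u^* = \operatorname{argmin}_{u:\,\|u\|_1\le\delta} \tfrac12\|u-y\|_2^2$ can be computed with complexity $O(p\log(p))$. Thus, the unconstrained optimal solution $\bar x = x^*(u^*) = y - u^*$ can be computed in $O(p\log(p))$ operations.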
Next, consider the constrained optimization problem. The constrained optimum $x^* = \bar x$ whenever $\bar x$ is feasible, i.e. $\|\bar x\|_\infty \le \eta$. Since $f(x) = \delta\|x\|_\infty + \tfrac12\|x-y\|_2^2$ is strictly convex, Lemma A.3 implies that $\|x^*\|_\infty = \eta$ whenever $\|\bar x\|_\infty > \eta$. Therefore,

\[
\min\Big\{ \delta\|x\|_\infty + \tfrac12\|x-y\|_2^2 : \|x\|_\infty \le \eta \Big\} = \delta\eta + \min\Big\{ \tfrac12\|x-y\|_2^2 : \|x\|_\infty = \eta \Big\}.
\]

It is easy to check that $\operatorname{sign}(x^*_i) = \operatorname{sign}(y_i)$ for all $i = 1, \ldots, p$. Thus,

\[
\min\Big\{ \tfrac12\|x-y\|_2^2 : \|x\|_\infty = \eta \Big\} = \min\Big\{ \tfrac12\|x - |y|\,\|_2^2 : 0 \le x_i \le \eta \Big\}.
\]

Note that we are guaranteed that $\max_i\{x_i\} = \eta$ because $\|\bar x\|_\infty > \eta$. The optimal solution of the 1-dimensional problem $\operatorname{argmin}\{\tfrac12(x - |y|)^2 : 0 \le x \le \eta\}$ is $\min\{|y|, \eta\}$. Thus, it follows that $x^* = \operatorname{sign}(y) \odot \min\{|y|, \eta\mathbf{1}\}$, where $\odot$ denotes componentwise multiplication and $\mathbf{1}$ is a vector of ones, and the complexity of computing $x^*$ is $O(p\log(p))$.
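Cases (ii) and (iii) translate directly into short routines. The following Python sketch assumes NumPy; the function names `shrink_l2`, `project_l1_ball` and `shrink_linf` are hypothetical. For the $\ell_1$-ball projection needed in case (iii), the sketch swaps in the standard sort-based $O(p\log p)$ projection, which is an assumption made for illustration and not necessarily the procedure of Lemma A.4 in [1].

```python
import numpy as np

def shrink_l2(y, delta, eta=np.inf):
    """min delta*||x||_2 + 0.5*||x - y||_2^2  subject to ||x||_2 <= eta."""
    norm_y = np.linalg.norm(y)
    if norm_y <= delta:
        return np.zeros_like(y)              # unconstrained optimum is 0
    x_bar = y * (1.0 - delta / norm_y)       # unconstrained optimum
    if np.linalg.norm(x_bar) <= eta:
        return x_bar
    return eta * y / norm_y                  # KKT point on the sphere

def project_l1_ball(y, radius):
    """Euclidean projection of y onto {u : ||u||_1 <= radius} (sort-based)."""
    a = np.abs(y)
    if radius <= 0:
        return np.zeros_like(y)
    if a.sum() <= radius:
        return y.copy()
    u = np.sort(a)[::-1]                     # magnitudes in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)  # soft-threshold level
    return np.sign(y) * np.maximum(a - theta, 0.0)

def shrink_linf(y, delta, eta=np.inf):
    """min delta*||x||_inf + 0.5*||x - y||_2^2  subject to ||x||_inf <= eta."""
    x_bar = y - project_l1_ball(y, delta)    # unconstrained optimum y - u*
    if np.max(np.abs(x_bar)) <= eta:
        return x_bar
    # Constraint active: componentwise clipping of y at level eta.
    return np.sign(y) * np.minimum(np.abs(y), eta)
```

As a small worked instance, for $y = (3, 4)$, $\delta = 1$ the unconstrained $\ell_2$ solution is $\bar x = 0.8\,y = (2.4, 3.2)$ with $\|\bar x\|_2 = 4$, so with $\eta = 2$ the routine returns $x^* = 2y/\|y\|_2 = (1.2, 1.6)$.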






